Abstract 

We consider a range of "theories" that violate the uncertainty relation for anti-commuting 
observables derived in [JMP, 49, 062105 (2008)]. We first show that Tsirelson's bound for the 
CHSH inequality can be derived from this uncertainty relation, and that relaxing this rela- 
tion allows for non-local correlations that are stronger than what can be obtained in quantum 
mechanics. We continue to construct a hierarchy of related non-signaling theories, and show 
that on one hand they admit superstrong random access encodings and exponential savings 
for a particular communication problem, while on the other hand it becomes much harder in 
these theories to learn a state. We show that the existence of these effects stems from the 
absence of certain constraints on the expectation values of commuting measurements from our 
non-signaling theories that are present in quantum theory. 

1 Introduction 

In any physical theory, we may consider measurements $M$ that when applied to a state $\rho$ result 
in some measurement outcome $k$ with probability $P(k|M)$, depending on $\rho$. A crucial element in 
characterizing the power of any physical theory lies in understanding what probability distributions 
are indeed possible. Quantum theory, for example, imposes strict limits on such distributions, which 
greatly affects our ability to perform information processing tasks [39]. One of these limitations 
is commonly known as an uncertainty relation. We may for example ask whether for some fixed 
choice of measurements $M_1$ and $M_2$ there even exists any state such that both distributions can 
be arbitrarily well defined. That is, is it possible that there exist outcomes $k_1$ and $k_2$ such that 
$P(k_1|M_1) = P(k_2|M_2) = 1$? Curiously, it turns out that in quantum theory there do indeed exist 
pairs of measurements $M_1$ and $M_2$ for which this is impossible. Another limitation is known as 
the strength of non-local correlations, which are restrictions on the joint probability distributions 
we can obtain when performing measurements on spatially separated systems. Classically, these 
limitations are known as Bell inequalities, and the corresponding limitations in the quantum case 
are referred to as Tsirelson bounds. 

* gregv@caltech.edu 
^ weliner@caltech.edu 
Since quantum mechanics imposes very stringent restrictions on the possible distributions [22], 
we would much like to understand their extent and implications. To this end, it is instructive to 
remove some of these restrictions and investigate how our ability to perform information processing 
tasks changes as a result. In this work, we will relax an uncertainty relation, which greatly affects 
our ability to solve communication and coding tasks. We will also see that the different kinds of 
restrictions are very closely related and show that for example Tsirelson's bound for the CHSH 
inequality is a consequence of the uncertainty relation of [42]. 

1.1 Previous work 

Previous work has focused on investigating one particular restriction imposed by quantum mechan- 
ics, namely its limits on non-local correlations. Indeed, the existence of non-local correlations in 
quantum mechanics that are stronger than those allowed by local realism [9], but yet strictly weaker 
than those consistent with the no-signaling principle [31], poses an enigma to the understanding of 
the foundations of quantum physics. What are the properties of quantum mechanics that disallow 
these stronger correlations [24]? And what possibilities would be opened by the existence of these 
correlations? Much of the work exploring these questions has focused on the "box paradigm" that 
was initially inspired by the CHSH inequality [16]. This particular Bell inequality [9] can be cast 
into a form of a simple game between two players, Alice and Bob. When the game starts, Alice 
and Bob are presented with randomly and independently chosen questions $s \in \{0,1\}$ and $t \in \{0,1\}$ 
respectively. They win if and only if they manage to return answers $a \in \{0,1\}$ and $b \in \{0,1\}$ such 
that $s \cdot t = a \oplus b$. Alice and Bob may agree on any strategy before the game starts, but may 
not communicate afterwards. Classically, that is, in any model based on local realism, this strategy 
consists of shared randomness. It has been shown [16] that for any such strategy we have 

$$\gamma := \frac{1}{4} \sum_{s,t \in \{0,1\}} \Pr[s \cdot t = a_s \oplus b_t] \leq \frac{3}{4},$$

where $\Pr[s \cdot t = a_s \oplus b_t]$ is the probability that Alice and Bob return winning answers $a_s$ and $b_t$ 
when presented with questions s and t. Quantumly, Alice and Bob may choose any shared quantum 
state together with local measurements as part of their strategy. This allows them to violate the 
inequality above, but curiously only up to a value 

$$\gamma = \frac{1}{2} + \frac{1}{2\sqrt{2}},$$

known as Tsirelson's bound [14, 15]. We will see later that there exists a state $|\psi\rangle_{AB}$ shared by 
Alice and Bob that achieves this bound when Alice and Bob perform measurements given by the 
observables $A_0 = B_0 = X$ and $A_1 = B_1 = Z$, where we use $A_s$ and $B_t$ to denote the measurement 
corresponding to questions $s$ and $t$ respectively. The non-signaling principle that disallows faster 
than light communication between Alice and Bob alone does not impose such a restrictive bound. 
Hence, Popescu and Rohrlich [31, 32, 33] raised the question why nature is not more 'non-local'? 
That is, why does quantum mechanics not allow for a stronger violation of the CHSH inequality 
up to the maximal value of 1? To gain more insight into this question, they constructed a 
toy-theory based on so-called PR-boxes [H]. Each such box takes inputs $s, t \in \{0,1\}$ from Alice 
and Bob respectively and simply outputs randomly chosen measurement outcomes $a_s, b_t$ such that 
$s \cdot t = a_s \oplus b_t$. Each such box can be used exactly once, and no notion of post-measurement states 
exists. Note that Alice and Bob still cannot use this box to transmit any information. However, 
since we have for all $s$ and $t$ that $\Pr[s \cdot t = a_s \oplus b_t] = 1$, Tsirelson's bound is clearly violated. It 
is interesting to consider how our ability to perform information processing tasks would change if 
PR-boxes indeed existed. For example, it has been shown that Alice and Bob can use such PR-boxes 
to compute any Boolean function $f : \{0,1\}^{2n} \to \{0,1\}$ of their individual inputs $x \in \{0,1\}^n$ and 
$y \in \{0,1\}^n$ by communicating only a single bit [39], which is even true when the boxes have slight 
imperfections. 
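
The three values of the CHSH game discussed above can be checked by brute force. The following is a minimal sketch, not taken from the paper: the function names (`chsh_value`, `deterministic`, `pr_box`) are hypothetical, and the classical maximum is computed over deterministic strategies only, which suffices because shared randomness is a convex mixture of such strategies.

```python
from itertools import product
from math import sqrt


def chsh_value(strategy):
    """Average winning probability over the four question pairs (s, t).

    `strategy(s, t)` returns a dict mapping answer pairs (a, b) to probabilities.
    """
    total = 0.0
    for s, t in product((0, 1), repeat=2):
        dist = strategy(s, t)
        # The players win when a XOR b equals s AND t.
        total += sum(p for (a, b), p in dist.items() if (a ^ b) == (s & t))
    return total / 4


def deterministic(fa, fb):
    """Local strategy: fa[s] is Alice's answer to s, fb[t] is Bob's answer to t."""
    return lambda s, t: {(fa[s], fb[t]): 1.0}


def pr_box(s, t):
    """PR-box: a is uniformly random and b is fixed by a XOR b = s AND t."""
    return {(a, a ^ (s & t)): 0.5 for a in (0, 1)}


classical_max = max(
    chsh_value(deterministic(fa, fb))
    for fa in product((0, 1), repeat=2)
    for fb in product((0, 1), repeat=2)
)
tsirelson = 0.5 + 1 / (2 * sqrt(2))
pr_value = chsh_value(pr_box)

print(classical_max, tsirelson, pr_value)
```

Note that Alice's marginal in `pr_box` is uniform for every choice of `t`, which is the sense in which the box cannot be used to transmit information.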

Much interest has since been devoted to the study of such PR-boxes and their generalizations 
known as non-local boxes \23\ \T3\ \T9\ l28l |5l El |27]. In particular, they have been incorporated in 
a very nice way into generalized non-signaling theories (GNST) due to Barrett [8] (the relation of 
such theories to generalizations of quantum theory is due to Hardy [21]) as a means of exploring 
foundational questions in quantum information. Intuitively, such theories allow for "boxes" involv- 
ing many more inputs for one or more players/systems, and also allow for some transformations 
between such boxes. Both theories seek out physically motivated properties that single out quan- 
tum mechanics from other theories such as the classical world. These theories have also found 
interesting applications in deriving new bounds for quantum mechanics itself, e.g., monogamy of 
entanglement [57] . 

In such a theory, n-partite states are characterized by the probabilities of obtaining certain 
outcomes when performing a fixed set of local fiducial measurements on each system. For example, 
to describe a non-local box, consider a bipartite system, where Alice holds the first and Bob the 
second system. We will label both Alice and Bob's measurements using X and Z in analogy 
to the quantum setting. For convenience we will also label the outcomes using $a, b \in \{0,1\}$, 
where the actual outcomes of $X$ and $Z$ in the quantum setting could be recovered as $(-1)^a$, and 
use $p(A|M)$ to denote the probability of obtaining outcomes $A$ for measurements $M$. A 
non-local box is now given by the probabilities $p(0,0|X,X) = p(0,0|X,Z) = p(0,0|Z,X) = 1/2$, 
$p(1,1|X,X) = p(1,1|X,Z) = p(1,1|Z,X) = 1/2$, $p(0,1|Z,Z) = p(1,0|Z,Z) = 1/2$ and $p(A|M) = 0$ 
otherwise. We will describe such theories in more detail in section 4. We will also refer to GNST 
using the commonly used term "box-world". 

1.2 Relaxed uncertainty relations 

Even when allowing more than two measurements and outcomes, such boxes remain very artificial 
constructs and it is not quite clear how they relate to quantum theory. In this note, we hope 
to provide a more intuitive understanding by showing that superstrong correlations can indeed 
be obtained by relaxing an uncertainty relation known to hold in quantum theory. Consider any 
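anti-commuting observables $\Gamma_1, \ldots, \Gamma_{2n}$ satisfying 

$$\{\Gamma_j, \Gamma_k\} = 0$$

whenever $j \neq k$ and 

$$\Gamma_j^2 = I$$

for any $j \in [2n]$, and let $\Gamma_0 = i\Gamma_1 \cdots \Gamma_{2n}$ (see section 2 on how to construct such operators). It was 
shown in [12] that any quantum state $\rho$ obeys 

$$\sum_{j=0}^{2n} \mathrm{Tr}(\Gamma_j \rho)^2 \leq 1, \qquad (1)$$
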
which also leads to several entropic uncertainty relations for such observables. To see why Eq. (1) 
itself can be understood as an uncertainty relation, note that $\mathrm{Tr}(\Gamma_j \rho)$ is the expectation value of 
measuring the observable $\Gamma_j$ on $\rho$. The probability of obtaining a measurement outcome $b \in \{\pm 1\}$ 
can furthermore be written as $p(b|\Gamma_j) = 1/2 + b\,\mathrm{Tr}(\Gamma_j \rho)/2$. Hence, $\mathrm{Tr}(\Gamma_j \rho)$ can also be understood 
as the bias towards a particular measurement outcome. Eq. (1) now tells us that this bias cannot 
be arbitrarily large for all measurements $\Gamma_j$. Note that we could rewrite the condition of Eq. (1) as 
$\|v\|_2 \leq 1$ where $v = (\mathrm{Tr}(\Gamma_0 \rho), \ldots, \mathrm{Tr}(\Gamma_{2n} \rho))$. Whereas the uncertainty relations of [42] may appear 
unrelated to the problem of determining the strength of non-local correlations, we will see later 
that Tsirelson's bound for the CHSH inequality is in fact a consequence of Eq. (1), when we use 
the fact that local anti-commutation and maximal violations of the CHSH inequality are closely 
related [14, 38]. Thus, as one might intuitively guess, bounds for the strength of non-local 
correlations are indeed closely related to uncertainty relations, and such connections have been 
observed in a different form by [28, 8]. 

What happens if we merely ask for $\|v\|_p \leq 1$, where $\|\cdot\|_p$ is the $p$-norm of the vector $v$? 
Since Eq. (1) must hold for any quantum state, that is, for any positive semi-definite matrix $\rho$ 
with $\mathrm{Tr}(\rho) = 1$, it is clear that this allows operators $\rho$ which are no longer positive semi-definite. 
In the spirit of Barrett's GNST, we will however restrict ourselves to allowing a particular set of 
fiducial measurements only, for which the probabilities will remain positive and thus well-defined. 
In section 3 we will describe a hierarchy of such "theories" in detail, and investigate their power 
with respect to non-local correlations and information processing problems. In particular, we will 
see that 



• For the CHSH inequality, we can obtain at most 

$$\gamma = \frac{1}{2} + \frac{1}{2 \cdot 2^{1/p}} \quad \text{for } \|v\|_p \leq 1,$$

where in the limit of $p \to \infty$ the right-hand side becomes 1, and we have a state that acts 
analogously to a non-local box. 

• Furthermore, any unique XOR-game can be played with perfect success for $p \to \infty$. 
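
To get a feeling for how the CHSH value in the first item interpolates between the quantum value and that of a non-local box, one can tabulate it for a few values of $p$. This is only a numerical illustration of the stated formula; the function name `chsh_bound` is hypothetical.

```python
from math import sqrt


def chsh_bound(p):
    """CHSH value 1/2 + 1/(2 * 2**(1/p)) attainable under the constraint ||v||_p <= 1."""
    return 0.5 + 1.0 / (2.0 * 2.0 ** (1.0 / p))


# p = 2 is the quantum case (Tsirelson's bound); large p approaches the value 1.
for p in (1, 2, 3, 10, 10000):
    print(p, chsh_bound(p))
```

For $p = 2$ the formula reproduces $1/2 + 1/(2\sqrt{2})$, and the value increases monotonically with $p$.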

It is instructive to consider what our relaxed uncertainty relation means in the case of a single 
qubit. Note that for quantum mechanics we have $p = 2$, in which case Eq. (1) corresponds to the 
statement that $v$ must lie inside the Bloch sphere. Allowing different values of $p$ now constrains 
us to the corresponding $p$-spheres as depicted in Figure 1. It is interesting to consider that even 
though for $p > 2$ we obtain non-local correlations that are stronger than what quantum theory 
allows, we now have a weaker uncertainty relation than in quantum theory. It has previously 
been noted by Barrett [8] that GNST has no uncertainty relations for particular measurements. 
Our work makes this relation very intuitive. In particular, for the case of $p \to \infty$ corresponding 
to a non-local box we essentially place no restrictions on the bias $\mathrm{Tr}(\Gamma_j \rho)$ at all. Since Eq. (1) 
leads to the entropic uncertainty relations on which the security of the protocols in the bounded- 
quantum-storage model |iT71 [T8l H3] is based, it may be worth considering how certain cryptographic 
tasks change in the setting of non-local boxes. Indeed, it has recently been shown [44] that privacy 
amplification fails in a world based on non-local boxes. Whereas it is known that cryptographic tasks 
such as bit commitment and oblivious transfer are compatible with the no-signaling principle [13], 
little is known about them in general theories [7]. 
Figure 1: p-norm unit circles in dimension 2 for p = 1, 2, 3, 10, 10000 



It should be noted that except for a single qubit, Eq. (1) is of course only a necessary and 
not a sufficient condition for $\rho \geq 0$. In higher dimensions, such relations are much more involved, 
but have been obtained for certain operators |26| ITUI UT\ and also some operators relating more 
closely to unbiased measurements ^T. Relaxing this particular uncertainty relation is thus only 
one way to go. Yet, due to the rich structure of the Clifford algebra of operators $\Gamma_1, \ldots, \Gamma_{2n}$ and 
their central importance for entropic uncertainty relations and so-called XOR non-local games (also 
known as two-party correlation inequalities) with 2 measurement outcomes, this small relaxation 
allows us to gain some insights into their role in quantum information processing tasks. 

1.3 Information processing in generalized non-local theories 

Inspired by these relaxations in terms of an operator p, we then construct a hierarchy of p-GNST 
theories exhibiting similar constraints. For such theories, we identify a single gbit (defined in [8]) 
with a single qubit obeying the relaxed uncertainty relations above. That is, we will think of a 
single gbit as allowing three fiducial measurements labeled X, Z and Y in analogy to the quantum 
case. Whereas this choice is of course again quite arbitrary, and heavily inspired by the quantum 
setting, it will allow us to gain a slightly better understanding of the relation of "box-world" and 
quantum theory later on. We show that the states we allow above, as well as states in p-GNST's 
have several properties that set them apart from quantum theory. In particular, we will see that 

• In p-GNST, there exist superstrong random access encodings. For example, there exists 
an encoding of = 3" bits into (2n -|- l)^/^n gbits such that we can retrieve any bit with 
probability 1—e for e = 2 exp(— (2n-|-l)"^/P/2). Quantumly on the other hand it is known that 
we require at least $(1 - h(1 - \varepsilon))N$ qubits to encode $N$ classical bits with the same recovery 
probability, where h denotes the binary Shannon entropy. 

• As a consequence, in p-GNST there exist single-server PIR schemes with $O(\mathrm{polylog}(N))$ bits 
of communication for an $N$-bit database with large $N$, whereas quantumly $\Omega(N)$ bits are 
needed. 
• On the other hand, we show that in GNST it becomes much harder to learn a state in the 
sense of [1]. In fact, unlike in the quantum setting, we can essentially not ignore even a small 
part of the information we are given about a state. 
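
The quantum lower bound quoted in the first item of the list above, $(1 - h(1 - \varepsilon))N$ qubits for $N$ classical bits, is easy to evaluate numerically. The sketch below only illustrates that expression; the function names `binary_entropy` and `min_qubits` are hypothetical.

```python
from math import log2


def binary_entropy(x):
    """Binary Shannon entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def min_qubits(num_bits, eps):
    """Lower bound (1 - h(1 - eps)) * N on the qubits needed by a quantum
    random access encoding of N bits with recovery probability 1 - eps."""
    return (1 - binary_entropy(1 - eps)) * num_bits


# With recovery probability 0.9, a 1000-bit string still needs several hundred qubits.
print(min_qubits(1000, 0.1))
```

The bound vanishes only at $\varepsilon = 1/2$, where guessing already suffices, and equals $N$ for perfect recovery.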

Note that we thereby compare units of information, gbits vs. qubits, irrespective of a physical 
dimension, where gbits were previously defined in [8]. It may not be surprising that such effects 
exist for Hermitian operators $\rho$, when all we essentially demand is that the condition $\|v\|_p \leq 1$ is 
obeyed for any set of anti-commuting measurements. However, it will be interesting to consider 
why for example the superstrong random access code encodings we find above are disallowed in 
quantum theory, but allowed in GNST. 

1.4 Commuting measurements 

Although the results of local measurements suffice to describe quantum states p31 , our results sug- 
gest that building a toy-theory around local measurements acting on fixed systems alone (such as 
GNST) may miss part of the flavor when considering some applications. Quantum mechanics has a 
rich structure of commuting and anti-commuting measurements built in which make no particular 
reference to locality. Uncertainty relations impose restrictions for non-commuting measurements, 
such as for example the anti-commuting measurements $\Gamma_1, \ldots, \Gamma_{2n}$. However, we will see in 
section 2.4 that also certain sets of commuting measurements cannot have arbitrary expectation values 
when measured on a particular state $\rho$. As a simple example, consider a 2-qubit system shared 
between Alice and Bob, and consider the measurements $X \otimes I$, $I \otimes X$ and $X \otimes X$. Suppose that we 
have $\mathrm{Tr}((X \otimes I)\rho) = \mathrm{Tr}((I \otimes X)\rho) = 1$. This tells us that when Alice and Bob measure $X$ locally, 
they obtain an outcome of '1' each with probability 1. However, the measurement of $X \otimes X$ can very 
intuitively be viewed as Alice and Bob performing a local measurement of $X$ and taking the product 
of their outcomes. Hence, we do not expect a simultaneous assignment of $\mathrm{Tr}((X \otimes X)\rho) = -1$ to be 
consistent with the previous two expectation values. We will formalize this intuition in section 2.4 
where we will derive a series of conditions such expectation values must obey which in spirit is 
similar to [22]. 

GNST does satisfy these conditions for measurements that commute because they act on dif- 
ferent subsystems. It does not exhibit any inconsistencies otherwise, as no commutation relations 
are defined for measurements on the same system. The issue of such inconsistencies is further 
circumvented by the simple fact that a non-local box can only be used once, and there is no notion 
of subsequent measurements on the same system. This of course is perfectly adequate for studying 
the strength of non-local correlation between two space-like separated systems for example, and led 
to such perplexing results as [39]. We will however see that it is essentially this lack of additional 
constraints that allows us to form superstrong random access codes for example, and may indicate 
that using "box-world" to investigate the role of the strength of non-local correlations within quan- 
tum theory itself is possibly doomed to fail. It also indicates why defining a consistent notion of 
'post-measurement' states for non-local boxes is quite difficult, since many constraints that would 
allow such a task to succeed are simply not present in box-world. 

To see how box-world differs from quantum theory consider the measurements $M_1 = X \otimes Z$, 
$M_2 = Z \otimes X$ and $M_3 = -XZ \otimes XZ$. These are related in exactly the same way as the measurements 
we considered above, except that in GNST there is no notion that $M_1$ and $M_2$ commute. Yet, we 
intuitively expect similar conditions to hold as for the measurements above when trying to form 
an analogy to the quantum setting. Indeed, one can easily construct a unitary transformation that 
maps the measurements $M_1$, $M_2$ and $M_3$ into a form analogous to the above, where two of the 
measurements act on different systems. In GNST, however, the separation into different systems 
is always a given, which may lead to difficulties when examining some problems which are not 
really concerned with correlations among two distant systems alone, but to information processing 
in general. 



1.5 Outline 

Whereas we only examine a very small piece of the puzzle, our work hopes to shed some light on 
the relation between uncertainty relations, non-local correlations and the role of above mentioned 
consistency constraints in information processing. In section 2 we first explain the basic concepts 
we need to refer to, and discuss commuting measurements in more detail. In section 3 we then define a range 
of simple "theories" obtained by relaxing the uncertainty relation for anti-commuting observables. 
To highlight the analogy with non-local boxes, we then define a range of similar GNST-like theories 
in section 4. In sections 5, 6 and 7 we then investigate the power of such theories with respect 
to non-local correlations, random access codes, and information processing problems respectively. 
In section 2.4 we then investigate why such effects are possible within GNST, but not in quantum 
theory. Table 7.4 summarizes similarities and differences among theories. 



2 Preliminaries 

2.1 Basic concepts 

In the following, we write [n] := {1, . . . ,n} and use X, Z and Y to denote the well-known Pauli 
matrices p9] . We also speak of a string of Paulis to refer to a matrix of the form 

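$$S_{ab} := X^{a_1} Z^{b_1} \otimes \cdots \otimes X^{a_n} Z^{b_n} \qquad (2)$$

with $a = (a_1, \ldots, a_n)$, $b = (b_1, \ldots, b_n)$ and $a_j, b_j \in \{0,1\}$. We sometimes write the Pauli operator 
acting on subsystem $j$, with identity on the other subsystems as 

$$X_j = I^{\otimes (j-1)} \otimes X \otimes I^{\otimes (n-j)}.$$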

The Pauli basis expansion of a density matrix $\rho$ is given by $\rho = (I + \sum_{a,b} s_{ab} S_{ab})/d$, where we 
call $s_{ab}$ the coefficient of $S_{ab}$. Consider the form $f(a,b,a',b') = (a,b') + (a',b)$, where we write 
$(a,b) = \sum_j a_j b_j \bmod 2$. It is straightforward to convince yourself that for any pair $S_{ab}$ and $S_{a'b'}$ 
either $[S_{ab}, S_{a'b'}] = 0$ if $f(a,b,a',b') = 0$ or $\{S_{ab}, S_{a'b'}\} = 0$ if $f(a,b,a',b') = 1$. Whereas Eq. (1) 
holds for any choice of anti-commuting measurements, it is worth noting that in dimension $d = 2^n$ 
we can find at most $2n + 1$ anti-commuting operators given by 

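$$\Gamma_{2j-1} = Y^{\otimes (j-1)} \otimes X \otimes I^{\otimes (n-j)}, \qquad \Gamma_{2j} = Y^{\otimes (j-1)} \otimes Z \otimes I^{\otimes (n-j)}$$

for $j = 1, \ldots, n$ and $\Gamma_0 = i\Gamma_1 \cdots \Gamma_{2n}$. Note that for $n = 1$ we have $\Gamma_1 = X$, $\Gamma_2 = Z$, $\Gamma_0 = Y$ and 
Eq. (1) is equivalent to the Bloch sphere condition. We will also need the notion of a $p$-norm of a 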

¹ Consider $U = (I \otimes H)\,\mathrm{CNOT}\,(I \otimes H)$. 
vector $v = (v_1, \ldots, v_n) \in \mathbb{R}^n$, which is defined as 

$$\|v\|_p := \left( \sum_{i=1}^{n} |v_i|^p \right)^{1/p}.$$

Note that for $p = 2$ this is just the Euclidean norm. Of particular interest to us will also be the 
$\infty$-norm defined as $\|v\|_\infty := \lim_{p \to \infty} \|v\|_p$, which can also be written as 

$$\|v\|_\infty = \max(|v_1|, \ldots, |v_n|).$$
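
The anti-commuting operators and the $p$-norm condition can be explored numerically. The following is a minimal pure-Python sketch under stated assumptions: the helper names are hypothetical, the operators are built as $Y^{\otimes (j-1)} \otimes X \otimes I^{\otimes (n-j)}$ and $Y^{\otimes (j-1)} \otimes Z \otimes I^{\otimes (n-j)}$ (an assumed construction chosen to agree with $\Gamma_1 = X$, $\Gamma_2 = Z$ for $n = 1$), and $\Gamma_0$ is given the phase $i^n$ so that it is Hermitian and squares to the identity for every $n$; for $n = 1$ this coincides with $i\Gamma_1\Gamma_2 = Y$.

```python
import random
from functools import reduce

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def gammas(n):
    """Return [Gamma_0, Gamma_1, ..., Gamma_2n] on n qubits (assumed construction)."""
    ops = []
    for j in range(1, n + 1):
        for pauli in (X, Z):
            ops.append(reduce(kron, [Y] * (j - 1) + [pauli] + [I2] * (n - j)))
    phase = 1
    for _ in range(n):
        phase *= 1j  # i^n keeps Gamma_0 Hermitian; this is just i for n = 1
    gamma0 = [[phase * x for x in row] for row in reduce(matmul, ops)]
    return [gamma0] + ops


def anticommute_ok(ops, tol=1e-12):
    """Check Gamma_j Gamma_k + Gamma_k Gamma_j = 2 * delta_jk * I for all pairs."""
    dim = len(ops[0])
    for j, a in enumerate(ops):
        for k, b in enumerate(ops):
            ab, ba = matmul(a, b), matmul(b, a)
            for r in range(dim):
                for c in range(dim):
                    expected = 2.0 if (j == k and r == c) else 0.0
                    if abs(ab[r][c] + ba[r][c] - expected) > tol:
                        return False
    return True


def random_state(dim, rng):
    """Random normalized pure state as a list of complex amplitudes."""
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = sum(abs(a) ** 2 for a in amps) ** 0.5
    return [a / norm for a in amps]


def bias_vector(ops, psi):
    """v_j = <psi| Gamma_j |psi>, the expectation value of each observable."""
    dim = len(psi)
    return [
        sum(psi[r].conjugate() * op[r][c] * psi[c]
            for r in range(dim) for c in range(dim)).real
        for op in ops
    ]


def p_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


rng = random.Random(0)
v = bias_vector(gammas(2), random_state(4, rng))
print(p_norm(v, 2), max(abs(x) for x in v))
```

For pure single-qubit states the 2-norm of the bias vector equals 1, which is the Bloch sphere condition; for two qubits it is at most 1.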



2.2 Probability distributions 

Unlike previous descriptions of general probabilistic theories, our notation must be versatile enough 
to accommodate arbitrary choices of simultaneous commuting measurements, even if they do not 
act on separate subsystems. In quantum mechanics we may choose to measure $X \otimes X$ along with 
either $X \otimes I$, $I \otimes X$, or $Z \otimes Z$, $XZ \otimes XZ$. We will see that including this flexibility in a more general 
theory leads to new constraints. 

First, we want to consider some finite set of measurements $\mathcal{O} = \{M_1, \ldots, M_N\}$ where without 
loss of generality we assume that each measurement has the same finite set of outcomes $\mathcal{A}$ and 
that $\mathcal{O}$ is ordered lexicographically. Although we initially impose no structure on $\mathcal{O}$, in analogy to 
quantum mechanics we consider certain collections of measurements $C \subseteq \mathcal{O}$ to have some property 
which directly corresponds to simultaneous measurability. In particular, we will consider the set of 
possible experiments 

$$\mathcal{E} := \{C \subseteq \mathcal{O} \mid \forall M_i, M_j \in C \;\; \mathrm{sim}(M_i, M_j) = 0\},$$

where "sim" is a predicate indicating simultaneous measurability that remains to be specified. Of 
particular concern to us will be the probability distributions $p$ over the outcomes $A \in \mathcal{A}^{\times |C|}$ of some 
set of simultaneously performed measurements $C \in \mathcal{E}$. We use $p(A|C)$ to denote the probability of 
obtaining outcomes $A = (A_1, A_2, \ldots, A_{|C|}) \in \mathcal{A}^{\times |C|}$ for measurements $C \subseteq \mathcal{O}$ where we wlog take 
$C$ to be ordered lexicographically. For simplicity, we will also write $p(A_1, \ldots, A_n|M_1, \ldots, M_n) := 
p((A_1, \ldots, A_n)|\{M_1, \ldots, M_n\})$. 

What conditions do the functions $p : \mathcal{A}^{\times |C|} \times \mathcal{E} \to [0,1]$ have to fulfill to be a valid probability 
distribution for any experiment $C \in \mathcal{E}$? We require that the following conditions be satisfied 
for any probability distribution: 

(1) Normalization: $\forall C \in \mathcal{E}$, $\sum_{A \in \mathcal{A}^{\times |C|}} p(A|C) = 1$. 

(2) Positivity: $\forall C \in \mathcal{E}$, $\forall A \in \mathcal{A}^{\times |C|}$, $p(A|C) \geq 0$. 

The next condition may appear unfamiliar at first glance. Intuitively it says that the distributions 
of outcomes we obtain for commuting measurements are independent of what other commuting 
measurements we perform. 

(3) Independence: 

$$\forall C, C' \in \mathcal{E} \text{ with } C \subseteq C', \quad p((A_1, \ldots, A_{|C|})|C) = \sum_{A_{|C|+1}, \ldots, A_{|C'|} \in \mathcal{A}} p((A_1, \ldots, A_{|C'|})|C'),$$

where, without loss of generality, we take the first $|C|$ outcomes to be associated with the 
measurements in $C$. 

Throughout this text, we explore the result of two different ways of choosing 
simultaneous measurements. First, we consider simultaneous measurements on distinct systems as reflected 
in the construction of non-local boxes. Second, we consider a more general notion of such mea- 
surements based on commutation relations as in quantum mechanics. Note that in the quantum 
case such sets of mutually commuting measurements induce a partitioning of the Hilbert space into 
different systems in the finite-dimensional setting [36, 22]. 

Consider the set of measurements $\mathcal{O}_P$ to be strings of Paulis on $n$-partite systems as defined in 
section 2.1. The two different notions of simultaneous measurements can now be expressed in two 
different choices of $\mathrm{sim}(M_i, M_j)$, leading to two different sets of realizable experiments. To capture 
the first notion, we let 

$$\mathcal{E}_L := \{C \subseteq \mathcal{O}_P \mid \forall M_i, M_j \in C \;\; \mathrm{local}(M_i, M_j) = 0\},$$

where $\mathrm{local}(M_i, M_j) = 0$ if and only if $M_i$ and $M_j$ act on different subsystems. For example, we 
have $\mathrm{local}(X \otimes I, I \otimes Z) = 0$. Second, we let 

$$\mathcal{E}_C := \{C \subseteq \mathcal{O}_P \mid \forall M_i, M_j \in C \;\; [M_i, M_j] = 0\},$$

where all commuting measurements are simultaneously observable, as in quantum mechanics. 
Clearly, $\mathcal{E}_L \subseteq \mathcal{E}_C$, since two measurements acting on two different subsystems commute. 
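
The two notions of simultaneous measurability can be made concrete by representing a string of Paulis by its bit vectors $(a, b)$ and using the form $f$ from section 2.1 to decide commutation. The sketch below is illustrative only; the function names are hypothetical, and the predicates are applied to pairs of distinct measurements, which is an interpretive assumption about the definitions above.

```python
from itertools import combinations, product


def commute(p, q):
    """Return 0 if the Pauli strings commute and 1 if they anti-commute.

    Each string is a pair (a, b) of bit tuples; this evaluates
    f(a, b, a', b') = (a, b') + (a', b) mod 2.
    """
    (a, b), (a2, b2) = p, q
    return (sum(x * y for x, y in zip(a, b2)) + sum(x * y for x, y in zip(a2, b))) % 2


def support(p):
    """Indices of the subsystems on which the string acts non-trivially."""
    a, b = p
    return {j for j, (x, z) in enumerate(zip(a, b)) if x or z}


def local(p, q):
    """Return 0 if the two strings act on disjoint subsystems, 1 otherwise."""
    return 0 if support(p).isdisjoint(support(q)) else 1


def is_experiment(collection, predicate):
    """True if every pair of distinct measurements satisfies predicate == 0."""
    return all(predicate(p, q) == 0 for p, q in combinations(collection, 2))


# Two-qubit examples: (a, b) with a marking X factors and b marking Z factors.
XI, IX = ((1, 0), (0, 0)), ((0, 1), (0, 0))
XX, ZZ = ((1, 1), (0, 0)), ((0, 0), (1, 1))
XZ_XZ = ((1, 1), (1, 1))          # XZ on both qubits
X_Z, Z_X = ((1, 0), (0, 1)), ((0, 1), (1, 0))

print(is_experiment([XI, IX], local))         # local and commuting
print(is_experiment([XX, ZZ, XZ_XZ], commute))  # commuting but not local
print(is_experiment([X_Z, Z_X], local))
```

The pair $X \otimes Z$, $Z \otimes X$ commutes although both operators act on both subsystems, so it forms an experiment under the commutation notion but not under the local one.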

When we restrict ourselves to $\mathcal{E}_L$ we can express the independence condition from above in the 
more familiar form of no-signaling: 

(3') No-signaling: 

$$\forall C, C' \in \mathcal{E}_L \text{ with } C \subseteq C', \quad p((A_1, \ldots, A_{|C|})|C) = \sum_{A_{|C|+1}, \ldots, A_{|C'|} \in \mathcal{A}} p((A_1, \ldots, A_{|C'|})|C').$$

Intuitively, the no-signaling condition just dictates that the marginal distribution of a 
particular subset of systems is independent of the measurement choices on a disjoint subset of 
systems. Therefore, we can simplify our description of marginals of no-signaling distributions to just 
$p(A \in \mathcal{A}^{\times |C|}|C') = p(A|C)$, where the measurement choices on other parties are arbitrary. We will 
later see that imposing only the special case of the no-signaling condition, versus the full indepen- 
dence condition of (3), makes a crucial difference in the power of the resulting theory with respect 
to encoding information. 

Example 2.1. Consider the set of local experiments for two parties with $\mathcal{A} = \{-1, 1\}$, $\mathcal{O} = 
\{X_1, Z_1, X_2, Z_2\}$. Let the probability distribution $p(A|C)$ be described by the following table. 

  A \ C       {X_1,X_2}   {X_1,Z_2}   {Z_1,X_2}   {Z_1,Z_2}
  (1,1)          1/2         1/2         1/2          0
  (1,-1)          0           0           0          1/2
  (-1,1)          0           0           0          1/2
  (-1,-1)        1/2         1/2         1/2          0

Clearly, we have positivity, and the sum over each measurement setting (column) is 1. 
Finally, note that the marginal probability distribution for either party is constant, $\forall C \in \mathcal{E}_L, \forall A_i \in 
\mathcal{A}$, $\sum_{A_j, j \neq i} p((A_1, A_2)|C) = 1/2$; therefore this distribution is no-signaling. 
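
The three properties claimed in Example 2.1 can be verified mechanically. The following sketch encodes the table as a function and checks normalization, positivity and no-signaling with exact rational arithmetic; the names `box`, `marginal` and `is_no_signaling` are hypothetical.

```python
from fractions import Fraction
from itertools import product

settings = [("X1", "X2"), ("X1", "Z2"), ("Z1", "X2"), ("Z1", "Z2")]
outcomes = list(product((1, -1), repeat=2))


def box(A, setting):
    """p(A|C) from the table: outcomes are anti-correlated only for {Z1, Z2}."""
    a1, a2 = A
    anti_correlated = setting == ("Z1", "Z2")
    return Fraction(1, 2) if (a1 != a2) == anti_correlated else Fraction(0)


def marginal(party, value, setting):
    """Probability that `party` (0 or 1) sees `value`, summed over the other outcome."""
    return sum(box(A, setting) for A in outcomes if A[party] == value)


def is_no_signaling():
    """A party's marginal must not depend on the other party's measurement choice."""
    for party in (0, 1):
        for s1, s2 in product(settings, repeat=2):
            if s1[party] != s2[party]:
                continue
            for value in (1, -1):
                if marginal(party, value, s1) != marginal(party, value, s2):
                    return False
    return True


normalized = all(sum(box(A, C) for A in outcomes) == 1 for C in settings)
positive = all(box(A, C) >= 0 for A in outcomes for C in settings)
print(normalized, positive, is_no_signaling())
```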

2.3 Moments 

Any finite, discrete probability distribution has a dual representation in terms of a finite number 
of moments [30]. We define the product of the outcomes $A = (A_1, \ldots, A_{|C|}) \in \mathcal{A}^{\times |C|}$ of a collection 
of measurements $C \in \mathcal{E}$ as $A^* = \prod_{i=1}^{|C|} A_i$. The moment for this measurement is defined as 

$$m(C) := \sum_{A \in \mathcal{A}^{\times |C|}} p(A|C)\, A^*. \qquad (3)$$

Note that for the identity measurement this means $m(I) = 1$ because of normalization. Also, if you 
consider the moment for some subset of C, by the independence principle this definition gives a 
unique value which does not depend on the choice of other measurements made simultaneously. 

Since we will only be concerned with measurements with two outcomes A = {±1}, we now 
restrict ourselves to this case for simplicity. For the measurement of a single observable $C = \{M_1\}$ 
with outcome $A_1 \in \mathcal{A}$, we can easily recover the probabilities from the moments as 

$$p((A_1)|\{M_1\}) = \frac{1}{2}\left(1 + A_1\, m(\{M_1\})\right). \qquad (4)$$

In subsequent notation, we will drop the brackets within parentheses when it increases readability. 

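Note that we can recover the probability for a specific set of outcomes $A \in \mathcal{A}^{\times |C|}$ and 
measurements $C \in \mathcal{E}$ from these moments. Without loss of generality, let $C = \{M_1, \ldots, M_n\}$. Then 

$$\frac{1}{2^n} \sum_{C' \subseteq C} m(C') \prod_{i : M_i \in C'} A_i$$
$$= \frac{1}{2^n} \sum_{C' \subseteq C} \; \sum_{A' \in \mathcal{A}^{\times |C'|}} p(A'|C') \prod_{i : M_i \in C'} A'_i A_i$$
$$= \frac{1}{2^n} \sum_{A' \in \mathcal{A}^{\times |C|}} p(A'|C) \sum_{C' \subseteq C} \; \prod_{i : M_i \in C'} A'_i A_i.$$

The second line simply uses the definition of $m(C')$ and the third line uses the independence 
principle to write $p(A'|C')$ in terms of $p(A'|C)$, allowing us to move the sum over $C'$ inside. Now 
note that the sum over $C'$ can be broken into $n$ sums over whether or not $M_i \in C'$. For each $M_i$, 
if it is in $C'$ we get a factor of $A'_i A_i$, otherwise a factor of 1. 

$$= \frac{1}{2^n} \sum_{A' \in \mathcal{A}^{\times |C|}} p(A'|C) \prod_{i=1}^{n} \left(1 + A'_i A_i\right)$$

Because the outcomes can only be $\pm 1$, each factor can give us only 0 or 2. 

$$= \sum_{A' \in \mathcal{A}^{\times |C|}} p(A'|C) \prod_{i=1}^{n} \delta_{A_i, A'_i} = p(A|C)$$
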
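
The inversion of moments into probabilities derived above can be tested on a small example. The sketch below computes all moments $m(C')$ of a joint distribution over $n$ measurements with outcomes $\pm 1$, using the independence principle to obtain the moment of each sub-collection from the joint distribution, and then recovers every probability from the moments. The function names and the example distribution are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod


def moments_from_distribution(p, n):
    """m(C') for every sub-collection C' of C = {M_1, ..., M_n}.

    `p` maps outcome tuples A in {+1, -1}^n to probabilities; sub-collections
    are keyed by tuples of indices, with the empty tuple giving m(I) = 1.
    """
    return {
        subset: sum(prob * prod(A[i] for i in subset) for A, prob in p.items())
        for r in range(n + 1)
        for subset in combinations(range(n), r)
    }


def probability_from_moments(m, A):
    """p(A|C) = 2^(-n) * sum over C' of m(C') * prod_{i in C'} A_i."""
    n = len(A)
    return sum(val * prod(A[i] for i in subset) for subset, val in m.items()) / 2 ** n


# An arbitrary example distribution over three two-outcome measurements.
outcomes = list(product((1, -1), repeat=3))
p = {A: Fraction(w, 36) for A, w in zip(outcomes, range(1, 9))}

m = moments_from_distribution(p, 3)
recovered = {A: probability_from_moments(m, A) for A in outcomes}
print(recovered == p)
```

Exact rational arithmetic is used so that the round trip can be compared with equality rather than a tolerance.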
2.4 Consistency constraints 

We are now ready to investigate the constraints that arise due to simultaneous measurement of 
commuting observables and that will play a crucial role in understanding the differences between 
quantum theory and p-GNST. Imagine two commuting measurements $[M_i, M_j] = 0$, and their 
product $M_k = M_i M_j$. In quantum mechanics the outcome of the measurement $M_k$ is the same as 
the product of the outcomes of $M_i$ and $M_j$, which can be verified by expanding $M_k$ in terms of $M_i$ 
and $M_j$ and using the fact that they have a joint eigenbasis. What happens if we take this to be 
true in any theory? If we are only allowed to make local measurements, then this is a moot point. 
We can only get $X \otimes X$ by measuring $X$ and $X$ and multiplying the results. 

But if we are allowed to make any combination of commuting measurements, this will impose 
some interesting conditions. For example, in the quantum case we may have $M_1 = X \otimes X$, $M_2 = 
Z \otimes Z$ and $M_3 = XZ \otimes XZ$. To see that this has consequences in terms of the moments, consider 
the simple example where $m(M_1) = 1$ and $m(M_2) = 1$, which means that we will deterministically 
observe outcomes $A(M_1) = A(M_2) = 1$. Hence, $m(M_3) = -1$ should intuitively not be compatible 
with these two moments for $M_1$ and $M_2$. 

How can we formalize these conditions? For example, Eq. (3) gives us that 

$$m(M_1 M_2) = m(M_1, M_2),$$

if we insist that outcomes of products of measurements equal the product of outcomes of individual 
measurements. For a given set of commuting measurements $C = \{M_1, \ldots, M_m\}$ with $M_j^2 = I$, let 
$s(C)$ be the $2^m$-element vector whose $k$-th entry is given by 

$$s_k := [s(C)]_k := M_1^{k_1} M_2^{k_2} \cdots M_m^{k_m} \qquad (5)$$

with $k \in \{0,1\}^m$ in lexicographic order. We now define the moment matrix $K_s$ by letting the entry 
in the $i$-th row and $j$-th column be given by 

$$[K_s]_{ij} := m(s_i s_j)/2^m.$$

Claim 2.2 (Adapted from Wainwright and Jordan [30]). Let $C = \{M_1, \ldots, M_m\}$ be a set of 
commuting measurements. Then $K_s \geq 0$ if and only if $p$ is a probability distribution (satisfying 
constraints (1) and (2)). 

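Proof. In addition to $K_s$, we define two more $2^m \times 2^m$ matrices, whose components are labeled by 
vectors $i, j \in \{0,1\}^m$ in lexicographic order as 

$$[B]_{ij} = 2^{-m/2} (-1)^{i \cdot j}, \qquad [P]_{ij} = \delta_{ij}\, p(A = ((-1)^{i_1}, \ldots, (-1)^{i_m})|C).$$

It is easily verified that $B$ is a unitary matrix. Note that $B$ is an example of a Hadamard matrix. 
Now we will show that $K_s = BPB^T$. 

$$[BPB^T]_{ij} = \frac{1}{2^m} \sum_{k,l \in \{0,1\}^m} (-1)^{i \cdot k}\, \delta_{kl}\, p(((-1)^{k_1}, \ldots, (-1)^{k_m})|C)\, (-1)^{l \cdot j}$$
$$= \frac{1}{2^m} \sum_{k \in \{0,1\}^m} p(((-1)^{k_1}, \ldots, (-1)^{k_m})|C)\, (-1)^{(i+j) \cdot k}$$
$$= \frac{1}{2^m} \sum_{k \in \{0,1\}^m} p(((-1)^{k_1}, \ldots, (-1)^{k_m})|C) \prod_{t=1}^{m} \left((-1)^{k_t}\right)^{i_t + j_t}$$
$$= \frac{1}{2^m}\, m(s_i s_j) = [K_s]_{ij}$$

Clearly, the probabilities $p(A|C)$ are non-negative (2) exactly when $P \geq 0$, and $P \geq 0$ if and only if $K_s \geq 0$ since $B$ 
is unitary. Similarly, the fact that $m(I) = 1$, $B$ is unitary and the trace is cyclic ensures that $p$ 
satisfies condition (1). □ 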

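Example 2.3. As an example, consider the case of two commuting measurements $M_1$ and $M_2$ with
$M_3 = M_1 M_2$. We have $s = (I, M_1, M_2, M_3)$ and, writing $a = m(M_1)$, $b = m(M_2)$, $c = m(M_3)$,

\[ 4 K_s = \begin{pmatrix}
m(I) & m(M_1) & m(M_2) & m(M_1 M_2) \\
m(M_1) & m(I) & m(M_3) & m(M_2) \\
m(M_2) & m(M_3) & m(I) & m(M_1) \\
m(M_3) & m(M_2) & m(M_1) & m(I)
\end{pmatrix}
= \begin{pmatrix}
1 & a & b & c \\
a & 1 & c & b \\
b & c & 1 & a \\
c & b & a & 1
\end{pmatrix}. \]

Demanding that the eigenvalues of this matrix,
$\lambda = ((1 + a - b - c), (1 - a + b - c), (1 - a - b + c), (1 + a + b + c))$, be non-negative is enough to ensure that $K_s \geq 0$. Using the Sylvester criteria,
we get the alternate constraints that each moment satisfies $|a|, |b|, |c| \leq 1$ and $1 - a^2 - b^2 - c^2 + 2abc \geq 0$, and
$\lambda_1 \lambda_2 \lambda_3 \lambda_4 \geq 0$.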
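The consistency check of Example 2.3 can be sketched numerically. The snippet below is only an illustration under the assumption that the moment matrix has the form reconstructed above; the function names are hypothetical. It uses the fact that the eigenvectors are the sign vectors $(1, s_1, s_2, s_1 s_2)$, so each eigenvalue equals four times one of the joint outcome probabilities.

```python
import itertools

def moment_matrix(a, b, c):
    """Unnormalized moment matrix (4 * K_s) for s = (I, M1, M2, M1*M2)."""
    return [[1, a, b, c],
            [a, 1, c, b],
            [b, c, 1, a],
            [c, b, a, 1]]

def eigenpairs(a, b, c):
    """Eigenvalues paired with their Hadamard-type eigenvectors (1, s1, s2, s1*s2)."""
    pairs = []
    for s1, s2 in itertools.product((1, -1), repeat=2):
        eigenvalue = 1 + s1 * a + s2 * b + s1 * s2 * c
        pairs.append((eigenvalue, (1, s1, s2, s1 * s2)))
    return pairs

def outcome_probabilities(a, b, c):
    """Joint distribution p(A1, A2) implied by the moments a, b, c."""
    return {(v[1], v[2]): lam / 4 for lam, v in eigenpairs(a, b, c)}

def is_consistent(a, b, c, tol=1e-12):
    """True if the moments come from a genuine probability distribution."""
    return all(lam >= -tol for lam, _ in eigenpairs(a, b, c))
```

For instance, `is_consistent(1, 1, -1)` is false, matching the intuition that $m(M_1) = m(M_2) = 1$ is incompatible with $m(M_3) = -1$.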

Our examples are reminiscent of the examples considered in the setting of contextuality [30].
Note that our constraints are related, but nevertheless of a different flavor since we only consider
such constraints for measurements which all commute. It may be interesting to consider such a
moment matrix in order to determine how "non-contextual" quantum theory is. In sections 3 and 4.1
we will develop classes of states which are restricted by imposing specific relationships among
various moments. In particular, it will be of crucial importance whether we merely impose such
constraints for measurements acting on different systems, or include such constraints for all commuting
measurements.



3 p-nonlocal theories and their properties

We now define a series of so-called p-nonlocal "theories", each one more constrained than the
previous. Our definition is thereby motivated by the uncertainty relations of [12] stated above.
We later relate our definitions to Barrett's GNST [8] and what are commonly known as non-local
boxes. Our aim in constructing this series of simple theories is merely to gain a more
intuitive understanding of superstrong non-local correlations due to non-local boxes.

3.1 A theory without consistency constraints 

We start with the simplest of all p-theories, which forms the basis of all subsequent definitions. In
essence, we will simply allow states violating the uncertainty relation in Eq. (1) without worrying about
anything else. In the spirit of Barrett [8] we start by defining the states which are allowed in our
theory, and then allow all linear transformations preserving the set of allowed states. For simplicity,
we will only consider the case of $d = 2^n$.

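Definition 3.1. A $d$-dimensional p-bin state is a $d \times d$ complex Hermitian matrix

\[ \rho = \frac{1}{d} \Big( I + \sum_{(a,b) \neq (0,0)} s_{ab} S_{ab} \Big) \]

satisfying

1. for all $a, b$, $-1 \leq s_{ab} \leq 1$.

2. for any set of mutually anti-commuting strings of Paulis $A_1, \ldots, A_m \in \mathbb{C}^{d \times d}$,

\[ \sum_j |\mathrm{Tr}(A_j \rho)|^p \leq 1. \]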

It remains to be specified what operations and measurements we are allowed to perform on
p-bin states. We define

Definition 3.2. A $d$-dimensional p-bin theory consists of

1. states $\rho \in \mathcal{S}_p$, where $\mathcal{S}_p$ is the set of $d$-dimensional p-bin states,

2. linear operations $T : \mathcal{S}_p \to \mathcal{S}_p$,

3. measurements described by observables $S_{ab} = S^0_{ab} - S^1_{ab}$, where $S^0_{ab}$ and $S^1_{ab}$ are projectors onto
the positive and negative eigenspace of $S_{ab}$ respectively. As in the quantum case we let
$p_0 = \mathrm{Tr}(\rho S^0_{ab})$ and $p_1 = \mathrm{Tr}(\rho S^1_{ab})$.

Starting from a state, we may apply any set of operations T followed by a single measurement. 

Note that by virtue of Eq. (1) any quantum state is a p-bin state. Note that the converse
however does not hold, since the conditions given above do not imply that a p-bin state $\rho$ is
positive semi-definite. It seems very restrictive to limit ourselves to a single measurement at the
end. The reason for this is that for some $p$, there exist p-bin states, valid operations
and measurements which, when followed by another operation, give us states that are no longer p-bin
states [12]. We return to this question when we consider the set of allowed operations below.

Note that the above definition is well-defined. First, we want that for any measurement $S_{ab}$,
$\{p_0, p_1\}$ forms a valid probability distribution. A small calculation gives us that for any p-nonlocal
state $\rho$ we have

\[ p_u = \mathrm{Tr}(\rho S^u_{ab}) = \frac{1}{2}\big(1 + (-1)^u s_{ab}\big), \]

and thus $0 \leq p_u \leq 1$ and $p_0 + p_1 = 1$. Second, we want the non-signaling conditions to hold. When
measuring $S_{ab} \otimes S_{a'b'}$ on a bipartite state $\rho$,
we have that the probability to obtain outcome $u$ for the measurement on the first system is given by

\[ \Pr[u | ab, a'b'] = \sum_{v \in \{0,1\}} \mathrm{Tr}\big(\rho\, (S^u_{ab} \otimes S^v_{a'b'})\big) = \mathrm{Tr}\big(\rho\, (S^u_{ab} \otimes I)\big), \]

and hence $\Pr[u | ab, a'b'] = \Pr[u | ab, a''b'']$ for all $a', b', a'', b''$ as desired. A similar argument can be
made to show that the more general independence condition is satisfied.

3.1.1 Basic Properties 

We now state some basic properties of this theory, which will also hold for a more restricted p- 
nonlocal theory as outlined below. 

Claim 3.3. If $\rho$ is a p-bin state, then $\rho$ is also a q-bin state for $p, q \in \mathbb{Z}$ with $q \geq p$.

Proof. This follows immediately from the fact that for any $r \in [0, 1]$ we have $r^q \leq r^p$. □

Below, we will apply circuits consisting of the Clifford gates {CNOT, X, Z, Y, H} and I. It is
easy to see that such unitary operations are allowed transformations taking p-bin states to p-bin
states.

Claim 3.4. Let $\rho \in \mathcal{S}_p$. Then for any circuit $U$ consisting solely of the gates {CNOT, X, Z, Y, H, I}
we have $U \rho U^{\dagger} \in \mathcal{S}_p$.

Proof. Note that $U$ is composed of single-qubit unitaries $U_j = I^{\otimes (j-1)} \otimes V \otimes I^{\otimes (n-j)}$ with $V \in \{X, Z, H\}$ and
unitaries $C_j = I^{\otimes (j-1)} \otimes \mathrm{CNOT} \otimes I^{\otimes (n-j-1)}$. First, it is straightforward to verify that for any $a, b \in \{0,1\}^n$,
there exist $a', b' \in \{0,1\}^n$ such that $U_j S_{ab} U_j^{\dagger} = S_{a'b'}$, and similarly for $C_j$. Second, applying a
unitary to any set of anti-commuting operators again gives us anti-commuting operators. Hence,
since we have $\sum_j |\mathrm{Tr}(A_j \rho)|^p \leq 1$ for any set of anti-commuting strings of Paulis, the resulting state
will also have this property. □

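It will also be useful to know that

\[ p_u = \mathrm{Tr}(\rho S^u_{ab}) = \frac{1}{2}\big(1 + (-1)^u s_{ab}\big), \]

as well as the following claim, where $\mathcal{S}_p^{d}$ denotes the set of $d$-dimensional p-bin states.

Claim 3.5. Let $\rho_1, \ldots, \rho_n \in \mathcal{S}_p^{2}$. Then $\bigotimes_{i=1}^{n} \rho_i \in \mathcal{S}_p^{2^n}$.

Proof. We proceed by induction. By assumption, $\rho_1 \in \mathcal{S}_p^{2}$. We will show that for any states
$\rho \in \mathcal{S}_p^{2^n}$, $\sigma \in \mathcal{S}_p^{2}$, the state $\rho \otimes \sigma \in \mathcal{S}_p^{2^{n+1}}$.

We need to prove that for any set of mutually anti-commuting Paulis $A_j \in \mathbb{C}^{2^{n+1} \times 2^{n+1}}$,
$\sum_j |\mathrm{Tr}(A_j\, \rho \otimes \sigma)|^p \leq 1$. Each $A_j$ can always be written in terms of a Pauli string $B_j$ acting on $\rho$, together with a
Pauli from $\{I, X, Y, Z\}$ acting on $\sigma$. We separate the $A_j$ into groups according to which Pauli is appended to
$B_j$, with indices $j_I, j_X, j_Y, j_Z$ running over the respective groups. Then we can rewrite the required inequality as

\[ \begin{aligned}
& \sum_{j_I} |\mathrm{Tr}((B_{j_I} \otimes I)(\rho \otimes \sigma))|^p + \sum_{j_X} |\mathrm{Tr}((B_{j_X} \otimes X)(\rho \otimes \sigma))|^p \\
& \quad + \sum_{j_Y} |\mathrm{Tr}((B_{j_Y} \otimes Y)(\rho \otimes \sigma))|^p + \sum_{j_Z} |\mathrm{Tr}((B_{j_Z} \otimes Z)(\rho \otimes \sigma))|^p \\
&= \sum_{j_I} |\mathrm{Tr}(B_{j_I} \rho)|^p + \sum_{j_X} |\mathrm{Tr}(B_{j_X} \rho)|^p\, |\mathrm{Tr}(X \sigma)|^p \\
& \quad + \sum_{j_Y} |\mathrm{Tr}(B_{j_Y} \rho)|^p\, |\mathrm{Tr}(Y \sigma)|^p + \sum_{j_Z} |\mathrm{Tr}(B_{j_Z} \rho)|^p\, |\mathrm{Tr}(Z \sigma)|^p \leq 1.
\end{aligned} \]

Since all the $A_j$ mutually anti-commute, for different $j, j'$ the relation $\{B_j \otimes X, B_{j'} \otimes X\} = 0$ implies
$\{B_j, B_{j'}\} = 0$, while $\{B_j \otimes X, B_{j'} \otimes Y\} = 0$ implies $[B_j, B_{j'}] = 0$. Then because $\rho \in \mathcal{S}_p^{2^n}$ and
$\{B_{j_X}, B_{j'_X}\} = 0$, and, for similar reasons, $\{B_{j_I}, B_{j'_I}\} = \{B_{j_I}, B_{j_X}\} = 0$, we know

\[ \sum_{j_I} |\mathrm{Tr}(B_{j_I} \rho)|^p + \sum_{j_X} |\mathrm{Tr}(B_{j_X} \rho)|^p \leq 1. \]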

ji jx 

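Now we will shorten our notation by writing

\[ \begin{aligned}
a_X &= |\mathrm{Tr}(X \sigma)|^p, & b_X &= \sum_{j_X} |\mathrm{Tr}(B_{j_X} \rho)|^p, \\
a_Y &= |\mathrm{Tr}(Y \sigma)|^p, & b_Y &= \sum_{j_Y} |\mathrm{Tr}(B_{j_Y} \rho)|^p, \\
a_Z &= |\mathrm{Tr}(Z \sigma)|^p, & b_Z &= \sum_{j_Z} |\mathrm{Tr}(B_{j_Z} \rho)|^p, \\
 & & b_I &= \sum_{j_I} |\mathrm{Tr}(B_{j_I} \rho)|^p.
\end{aligned} \]

This allows us to write inequalities implied by the uncertainty relation like:

\[ \begin{aligned}
a_X + a_Y + a_Z &\leq 1, \\
b_X + b_I &\leq 1, \\
b_Y + b_I &\leq 1, \\
b_Z + b_I &\leq 1.
\end{aligned} \]

We can also see that $a_X, a_Y, a_Z, b_X, b_Y, b_Z, b_I \geq 0$. The task at hand is to show that these inequalities
imply the one required of a state in $\mathcal{S}_p^{2^{n+1}}$, which we can now rewrite as

\[ a_X b_X + a_Y b_Y + a_Z b_Z + b_I \leq 1. \]

We do this by writing down a sum of products of non-negative quantities like $1 - a_X - a_Y - a_Z$
and noting that the result is non-negative:

\[ a_X (1 - b_X - b_I) + a_Y (1 - b_Y - b_I) + a_Z (1 - b_Z - b_I) + (1 - b_I)(1 - a_X - a_Y - a_Z) \geq 0. \]

That inequality can be rewritten as $1 - (a_X b_X + a_Y b_Y + a_Z b_Z + b_I) \geq 0$, which is what we set out
to show. Therefore, $\rho \otimes \sigma$ is a valid state, and, by induction, so is $\bigotimes_{i=1}^{n} \rho_i \in \mathcal{S}_p^{2^n}$ for any $n$. □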
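The final algebraic step of the proof can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it samples random non-negative values obeying the four constraints and confirms both the expansion identity and the resulting bound. All function names are hypothetical.

```python
import random

def slack_sum(aX, aY, aZ, bX, bY, bZ, bI):
    """Sum of products of non-negative quantities used in the proof."""
    return (aX * (1 - bX - bI) + aY * (1 - bY - bI) + aZ * (1 - bZ - bI)
            + (1 - bI) * (1 - aX - aY - aZ))

def target_slack(aX, aY, aZ, bX, bY, bZ, bI):
    """1 minus the quantity that must be bounded by 1 for the product state."""
    return 1 - (aX * bX + aY * bY + aZ * bZ + bI)

def random_feasible_point(rng):
    """Sample values with aX+aY+aZ <= 1 and bX+bI, bY+bI, bZ+bI <= 1."""
    bI = rng.random()
    bX, bY, bZ = (rng.random() * (1 - bI) for _ in range(3))
    while True:  # rejection sampling for the a-values
        a = [rng.random() for _ in range(3)]
        if sum(a) <= 1:
            return (a[0], a[1], a[2], bX, bY, bZ, bI)

def check_many(trials=1000, seed=0, tol=1e-12):
    """Return True if identity and bound hold on all sampled points."""
    rng = random.Random(seed)
    for _ in range(trials):
        point = random_feasible_point(rng)
        if abs(slack_sum(*point) - target_slack(*point)) > tol:
            return False
        if target_slack(*point) < -tol:
            return False
    return True
```

Sampling does not replace the proof; it only guards against sign or bookkeeping errors in the expansion.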
3.2 An analogue to box-world

Note that in the above definition we have not placed any constraints at all on the expectation values
of commuting measurements. This was not necessary, as we had allowed a single measurement only,
where by the above definition X formed such a single measurement. Now consider a two-qubit
system, i.e., $d = 4$. Suppose that we have for a particular $\rho$ that

\[ \mathrm{Tr}((X \otimes I)\rho) = \mathrm{Tr}((I \otimes X)\rho) = \mathrm{Tr}((X \otimes X)\rho) = -1. \]

Note that $\rho$ can be a perfectly valid state with respect to the definition given above, but yet
we would not consider this to be consistent behavior if we were allowed to perform subsequent
measurements: the individual outcomes $-1$ and $-1$ multiply to $+1$, not $-1$. We now introduce additional constraints that eliminate this inconsistency. It should
be clear from section 2.3 that to achieve full consistency we would have to introduce certain
constraints for commuting observables in general. Yet, we will first restrict ourselves to observables
on different systems in analogy to "box-world". We will show in section 4.1 that Barrett's GNST
and non-local boxes essentially correspond to this definition. We will also see in sections 6 and 7.1
that these additional constraints play a crucial role in the power of our model with respect to
information processing tasks.

Definition 3.6. A p-box state is a p-bin state $\rho$, where in addition we require that for any set
$C \in \mathcal{E}_l$ of measurements acting on different systems and $s(C)$ as defined in Eq. (5) we have that
the corresponding moment matrix $K_s$ defined in section 2.3 satisfies

\[ K_s \geq 0. \]

Note that claims 3.3 and 3.5 hold analogously for p-box states. It is important to note though
that claim 3.4 does not hold in this case, since for example the CNOT operation can lead to states
violating the definition.

3.3 A theory with consistency constraints 

Finally, we will impose all constraints required from our consistency considerations of section 2.3.

Definition 3.7. A p-nonlocal state is a p-box state $\rho$, where in addition we require that for any set
of commuting measurements $C \in \mathcal{E}_c$ and $s(C)$ as defined in Eq. (5) we have that the corresponding
moment matrix $K_s$ as defined in section 2.3 satisfies

\[ K_s \geq 0. \]

Again claims 3.3 and 3.5 hold analogously to the above. When we include all consistency
considerations, it is also easy to see that claim 3.4 holds for p-nonlocal states, since for any allowed
unitary $U$ we already have by the above that $\rho$ satisfies the constraints given by the set
$C' = \{U^{\dagger} M_1 U, \ldots, U^{\dagger} M_m U\}$ and hence $U \rho U^{\dagger}$ remains a valid p-nonlocal state.

4 Generalized non-local theories 

To create a closer analogy between our "theories" derived from relaxed uncertainty relations and
non-local boxes, we now consider a related class of theories called generalized no-signaling theories
(GNST) [8], for which we will consider similar relaxations. As already sketched in the introduction,
states in a GNST are defined operationally. Consider a laboratory setup where we have a device 
which prepares a specific state. We then use a measuring device which has a choice of settings 
allowing us to measure different properties of the system. The measuring device gives us a reading 
specifying the outcome of the measurement. A particular state in GNST is described completely 
by means of the probabilities of obtaining each outcome when performing a fixed set of fiducial 
measurements. For example, for a set of fiducial measurements $O = \{X, Z, Y\}$ with outcomes
$\mathcal{A} = \{\pm 1\}$, the probabilities $p(A|C)$ for all $A \in \mathcal{A}$ and $C \in O$ form a description of the state.
Hence, we will simply use p to refer to a state given by said conditional probabilities. The idea 
behind considering fiducial measurements stems from the idea that there exists a set of measurement 
choices that suffice to fully describe the system. In classical mechanics, for instance, we can always 
in principle make a single measurement which outputs all the information necessary to describe 
a state. For a qubit, on the other hand, we would need results from at least three different 
incompatible measurement settings, e.g., spin in three orthogonal directions. We refer to [8] for a
definition of GNST and its allowed operations. For us it will only be important to note that similar 
to the setting of non-local boxes, we can make only one measurement on each system, and there is 
no real notion of post-measurement states defined. 

In the following, we will be interested in the special case of multi-partite systems where on each
system we can perform one of three fiducial measurements with outcomes ±1. Using our notation
from section 2.2 we write the set of realizable experiments for GNST as

\[ \mathcal{E}_g = \{ \forall k \in \{1,2,3\}^n : \{W_{1,k_1}, \ldots, W_{n,k_n}\} \}, \]

with $W_{i,k_i}$ denoting a choice of the $k_i$-th measurement on the $i$-th system. Later we will connect these
measurement choices with Pauli measurements via the relation $W_{i,1} = X_i$, $W_{i,2} = Z_i$, $W_{i,3} = X_i Z_i$.
A key point of this definition will be that the partitioning of measurements into n systems will 
be fixed. We also demand that probability distributions should satisfy an independence principle. 
As we pointed out, when restricted to partitions over disjoint parties, this just reduces to the no- 
signaling principle. That is, the choice of measurement on one subset of particles can not be used 
to send a signal to a disjoint subset. 

In analogy to the quantum setting [8], we let one gbit refer to a single system on which we
can perform our set of fiducial measurements given above. Our definition of a gbit thereby slightly
differs from the definition given in [8], which only allows two fiducial measurements X and Z on a
single gbit. Yet, in order to compare the hierarchy of GNST-like theories we will construct below
to the p-box states from above we adopt this slightly more general definition in analogy to a single
qubit in the quantum case. Note that for the set of measurements $C \in \mathcal{E}_g$ specified above, an
n-gbit state, specified by $p : \mathcal{A}^{n} \times C \to [0, 1]$, is in GNST if $p$ satisfies constraints (1), (2), and
(3') in section 2.2.



Example 4.1. Consider the following state of one particle in GNST (or one gbit):

\[ \begin{aligned}
p(A = +1 | M = X) &= s_x = 1 - p(A = -1 | M = X), \\
p(A = +1 | M = Z) &= s_y = 1 - p(A = -1 | M = Z), \\
p(A = +1 | M = XZ) &= s_z = 1 - p(A = -1 | M = XZ).
\end{aligned} \]

This state is normalized, and positivity requires $s_x, s_y, s_z \in [0, 1]$. The state would be equivalent to
the state of an arbitrary qubit if and only if $s_x^2 + s_y^2 + s_z^2 \leq 1$, that is, if we are constrained to the
Bloch sphere.

For multi-partite states the difference between constraints on qubits and gbits becomes more
complicated. We now turn to describing a hierarchy of constraints on GNST theories which will be
analogous to uncertainty conditions in p-nonlocal theories and quantum mechanics.



4.1 p-GNST 

Even though states in GNST are defined without any particular structure to their measurements 
embedded, we will now impose a physically motivated structure. In particular, we will simply 
imagine in analogy to the quantum setting that measurements X, Z and Y obey the same anti- 
commutation relations as the Pauli matrices {X,Z} = {Z,Y} = {X,Y} = 0. In our definition 
below, we will for simplicity write $\{\cdot, \cdot\}$ to indicate that we imagine such an anti-commutation
constraint to hold exactly when the strings of Paulis $W_{i,k_i}$ associated with each $C$ would anti-commute.

First of all, this will allow us to artificially impose an uncertainty relation just like Eq. (1).

Definition 4.2. A state is in p-GNST if it is in GNST and for any set of measurements $S = \{C \in \mathcal{E}_g\}$
satisfying that for all $C, C' \in S$, $\{C, C'\} = 0$ we have

\[ \sum_{C \in S} |m(C)|^p \leq 1. \tag{6} \]

Note that for $p \to \infty$ this condition no longer restricts the states, because we get $\max_{C \in S} |m(C)| \leq 1$,
which is true for the original GNST, and non-local boxes. If we would actually add such commutation
and anti-commutation constraints we could now again distinguish between adding the
consistency constraints of section 2.3 only for measurements acting on different systems, or for all
commuting measurements in analogy to the p-box and p-nonlocal theories. In analogy to GNST,
where commutation relations were only defined for measurements acting on different systems, however,
we will stick to this setting, even when considering $p < \infty$. A p-GNST state is thus essentially
analogous to a p-box state, except we are allowed to make simultaneous measurements of locally
disjoint systems.



5 Superstrong non-locality 

Before we show that relaxing the uncertainty relation of Eq. (1) leads to superstrong non-local
correlations, let us take a look at what effect this uncertainty relation actually has on quantum
strategies for the CHSH inequality. For this purpose, we will rewrite Tsirelson's bound for the
CHSH inequality in its more common form as

\[ |\langle A_0 \otimes B_0 \rangle + \langle A_0 \otimes B_1 \rangle + \langle A_1 \otimes B_0 \rangle - \langle A_1 \otimes B_1 \rangle| \leq 2\sqrt{2}, \]

where we use $A_0, A_1$ and $B_0, B_1$ to denote Alice's and Bob's observables respectively, where $A_0^2 = A_1^2 = B_0^2 = B_1^2 = I$. We will use the fact that in order to achieve the maximum possible quantum
violation we must have $\{A_0, A_1\} = 0$ and $\{B_0, B_1\} = 0$ [H EEl El]. For $M_1 = A_0 \otimes B_0$, $M_2 = A_0 \otimes B_1$, $M_3 = A_1 \otimes B_0$ and $M_4 = A_1 \otimes B_1$ this means that we have $\{M_1, M_2\} = \{M_1, M_3\} =
\{M_2, M_4\} = \{M_3, M_4\} = 0$. Using the uncertainty relation of Eq. (1), proving Tsirelson's bound is
equivalent to solving the following optimization problem

\[ \begin{aligned}
\text{maximize} \quad & \langle M_1 \rangle + \langle M_2 \rangle + \langle M_3 \rangle - \langle M_4 \rangle \\
\text{subject to} \quad & \langle M_1 \rangle^2 + \langle M_2 \rangle^2 \leq 1, \\
& \langle M_1 \rangle^2 + \langle M_3 \rangle^2 \leq 1, \\
& \langle M_2 \rangle^2 + \langle M_4 \rangle^2 \leq 1, \\
& \langle M_3 \rangle^2 + \langle M_4 \rangle^2 \leq 1.
\end{aligned} \]

By using Lagrange multipliers, it is easy to see that for the optimum solution we have $\langle M_1 \rangle^2 = \langle M_4 \rangle^2$
and $\langle M_2 \rangle^2 = \langle M_3 \rangle^2$. By considering all different possibilities, we obtain that with $x = \langle M_1 \rangle = -\langle M_4 \rangle$ and $y = \langle M_2 \rangle = \langle M_3 \rangle$ our optimization problem becomes

\[ \begin{aligned}
\text{maximize} \quad & 2(x + y) \\
\text{subject to} \quad & x^2 + y^2 \leq 1.
\end{aligned} \]

Again using Lagrange multipliers, we now have that the maximum is attained at $x = y = 1/\sqrt{2}$,
giving us Tsirelson's bound.
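The optimization above can be explored numerically. The sketch below is illustrative only: it replaces the exponent 2 in the constraints by a general exponent $p \geq 1$ (an assumption made here to connect with the relaxed relations), evaluates the objective on a coarse grid, and compares with the closed form $4 \cdot 2^{-1/p}$, which reduces to $2\sqrt{2}$ for $p = 2$. The function names are hypothetical.

```python
import itertools
import math

def chsh_value(m):
    """Objective <M1> + <M2> + <M3> - <M4> for a tuple of four moments."""
    return m[0] + m[1] + m[2] - m[3]

def satisfies_constraints(m, p, tol=1e-9):
    """The four pairwise constraints for the anti-commuting pairs, with exponent p."""
    q = [abs(v) ** p for v in m]
    return (q[0] + q[1] <= 1 + tol and q[0] + q[2] <= 1 + tol
            and q[1] + q[3] <= 1 + tol and q[2] + q[3] <= 1 + tol)

def closed_form_maximum(p):
    """Maximum of the relaxed problem, assuming p >= 1."""
    return 4 * 2 ** (-1.0 / p)

def grid_maximum(p, steps=20):
    """Best objective value over a uniform grid on [-1, 1]^4 (a lower bound on the maximum)."""
    grid = [-1 + 2 * i / steps for i in range(steps + 1)]
    best = -math.inf
    for m in itertools.product(grid, repeat=4):
        if satisfies_constraints(m, p):
            best = max(best, chsh_value(m))
    return best
```

The grid search can only approach the optimum from below, so it serves as a consistency check on the closed form rather than as a derivation.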

Tsirelson's bound can hence be understood as a consequence of the uncertainty relation of [12]. 
Thus, we intuitively expect that relaxing this relation affects the strength of non-local correlations. 
In a similar way, one can view monogamy of non-local correlations as a consequence of Eq. (1) [35].



5.1 CHSH inequality 
5.1.1 In p-theories 

To see what is possible in p-theories, we first construct the equivalent of a maximally entangled
state. Let

\[ \rho_p = \frac{1}{2} \Big( I + \frac{1}{2^{1/p}} (X + Y) \Big). \]

Note that for $p \to \infty$ this gives us

\[ \rho_\infty = \frac{1}{2} [I + X + Y]. \]

We now proceed analogously to the quantum case to construct

\[ \eta_p = \mathrm{CNOT}\, (\rho_p \otimes |0\rangle\langle 0|)\, \mathrm{CNOT}^{\dagger}, \]

which by claim 3.4 is a valid p-bin and p-nonlocal state. It can also be verified that $\eta_p$ forms a valid
p-box state.

Claim 5.1. Let $A_1 = X$, $A_2 = Y$, $B_1 = X$ and $B_2 = Y$ be Alice and Bob's observables respectively.
Then

\[ \langle \mathrm{CHSH}_p \rangle = \mathrm{Tr}\big(\eta_p (A_1 \otimes B_1 + A_1 \otimes B_2 + A_2 \otimes B_1 - A_2 \otimes B_2)\big) = \frac{4}{2^{1/p}}, \]

for all p-theories.

Proof. This follows immediately by noting that

\[ \eta_p = \frac{1}{4} \Big( I + \frac{1}{2^{1/p}} (X \otimes X + X \otimes Y + Y \otimes X - Y \otimes Y) + Z \otimes Z \Big). \]

□

We can also phrase this statement in terms of probabilities as stated in the introduction, by
noting that the maximum probability that Alice and Bob win the CHSH game is given by

\[ \frac{1}{2} + \frac{\langle \mathrm{CHSH}_p \rangle}{8} = \frac{1}{2} + \frac{1}{2 \cdot 2^{1/p}}. \]

It is important to note that this violation can be obtained even when imposing the additional
consistency constraints from section 2.3.
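The value in Claim 5.1 can be reproduced by direct matrix arithmetic. The sketch below assumes the state $\rho_p = \frac{1}{2}(I + 2^{-1/p}(X + Y))$ and a CNOT with the first qubit as control; it uses plain nested lists so that it has no dependencies, and the helper names are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def linear_combination(*terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    size = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(size)] for i in range(size)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
KET0 = [[1, 0], [0, 0]]  # projector |0><0|
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # real and self-inverse

def eta(p):
    """eta_p = CNOT (rho_p tensor |0><0|) CNOT^dagger."""
    c = 2 ** (-1.0 / p)
    rho = linear_combination((0.5, I2), (0.5 * c, X), (0.5 * c, Y))
    return matmul(matmul(CNOT, kron(rho, KET0)), CNOT)

def chsh_value(p):
    """Tr(eta_p (X*X + X*Y + Y*X - Y*Y)) with * the tensor product."""
    op = linear_combination((1, kron(X, X)), (1, kron(X, Y)),
                            (1, kron(Y, X)), (-1, kron(Y, Y)))
    return trace(matmul(eta(p), op)).real
```

For $p = 2$ this gives $2\sqrt{2}$, and the value approaches 4 as $p$ grows.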



5.1.2 In p-GNST 

We already saw in the introduction that GNST admits states analogous to a non-local box, allowing
for a maximal violation of the CHSH inequality. We now show that similar states exist
for p-GNST theories analogous to p-box states. We first phrase the CHSH inequality in terms
of probabilities. In particular, consider the GNST state specified by

\[ p((A_1, A_2) | \{M_1, M_2\}) = \frac{1}{4} \big( 1 + (-1)^{\delta_{M_1, Z_1} \delta_{M_2, Z_2}} A_1 A_2 \lambda \big) \]

for some $\lambda$ to be chosen below. If each party measures $X$ or $Z$
on their state and outputs the result ±1, the probability that Alice and Bob win the CHSH game
is given by

\[ \begin{aligned}
\frac{1}{4} \big( & p(1,1|X_1,X_2) + p(-1,-1|X_1,X_2) + p(1,1|X_1,Z_2) + p(-1,-1|X_1,Z_2) \\
& + p(1,1|Z_1,X_2) + p(-1,-1|Z_1,X_2) + p(1,-1|Z_1,Z_2) + p(-1,1|Z_1,Z_2) \big) = \frac{1}{2} + \frac{\lambda}{2}.
\end{aligned} \]

In terms of the moments, $m(X_1, X_2) = m(X_1, Z_2) = m(Z_1, X_2) = -m(Z_1, Z_2) = \lambda$, and this
becomes

\[ \frac{1}{4} \Big( 2 + \frac{1}{2} \big( m(X_1, X_2) + m(X_1, Z_2) + m(Z_1, X_2) - m(Z_1, Z_2) \big) \Big) = \frac{1}{2} + \frac{\lambda}{2}. \]

Now we can consider the maximum value of $\lambda$ that gives a valid state in p-GNST. The requirements
listed in example 2.3 only restrict $|\lambda| \leq 1$. Eq. (6) requires $|m(X_1, X_2)|^p + |m(X_1, Z_2)|^p =
|m(Z_1, X_2)|^p + |m(Z_1, Z_2)|^p = 2|\lambda|^p \leq 1 \Rightarrow \lambda = (\frac{1}{2})^{1/p}$. Therefore in a p-GNST it is possible to win
the CHSH game with probability $1/2 + 1/(2 \cdot 2^{1/p})$.
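As an illustration, the winning probability of this box can be computed directly from the distribution. The snippet is a sketch under the assumption that the state is the $\lambda$-dependent distribution given above, with measurement settings labelled by the strings "X" and "Z"; the function names are hypothetical.

```python
def box_probability(A1, A2, M1, M2, lam):
    """p((A1, A2) | {M1, M2}) for outcomes in {+1, -1} and settings in {'X', 'Z'}."""
    sign = -1 if (M1 == "Z" and M2 == "Z") else 1
    return (1 + sign * A1 * A2 * lam) / 4

def win_probability(lam):
    """Average over the four settings of the probability of a winning outcome pair.

    Outcomes must be equal unless both parties measure Z, in which case they must differ.
    """
    total = 0.0
    for M1 in "XZ":
        for M2 in "XZ":
            wanted_product = -1 if (M1 == "Z" and M2 == "Z") else 1
            for A1 in (1, -1):
                for A2 in (1, -1):
                    if A1 * A2 == wanted_product:
                        total += box_probability(A1, A2, M1, M2, lam)
    return total / 4

def max_lambda(p):
    """Largest lambda allowed by 2 * |lambda|^p <= 1."""
    return 0.5 ** (1.0 / p)
```

With `max_lambda(2)` this reproduces the quantum value $1/2 + 1/(2\sqrt{2})$, while $\lambda = 1$ (the limit of large $p$) wins with certainty.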



5.2 XOR games 

We now investigate the case of general 2-player XOR games for $p \to \infty$. In such a game we have
arbitrary (but finite) sets of questions $S$ and $T$ from which Alice's and Bob's questions $s \in S$
and $t \in T$ are chosen according to a fixed probability distribution $\pi : S \times T \to [0, 1]$. Yet, the sets
of possible answers remain $A = B = \{0, 1\}$ for Alice and Bob respectively. The game furthermore
specifies a predicate $V : A \times B \times S \times T \to \{0, 1\}$ that determines the winning answers for Alice and
Bob. In an XOR game, this predicate depends only on the XOR $c = a \oplus b$ of Alice's answer $a$ and
Bob's answer $b$. We thus write $V(c|s,t) = 1$ if and only if answers $a$, $b$ satisfying $a \oplus b = c$ are
winning answers for questions $s$ and $t$. We will also restrict ourselves to unique games, which have
the property that for any $s, t, b$, there exists exactly one winning answer $a$ for Alice (and similarly
for Bob).

First of all, note that in the quantum case we may write the probability that Alice and Bob
return answers $a$ and $b$ with $a \oplus b = c$ as

\[ p(c|s,t) = \frac{1}{2} \big( 1 + (-1)^c \langle \Psi | A_s \otimes B_t | \Psi \rangle \big), \]

where we again use $A_s$ and $B_t$ to denote Alice's and Bob's observables corresponding to questions
$s$ and $t$ respectively and $|\Psi\rangle$ denotes the maximally entangled state. Note that we again have
$(A_s)^2 = (B_t)^2 = I$ from the fact that both measurements have only two outcomes. The probability
that Alice and Bob win the game can then be written as

\[ \sum_{s,t} \pi(s,t) \sum_{c} V(c|s,t)\, p(c|s,t). \]

Let $v_{st} = \langle \Psi | A_s \otimes B_t | \Psi \rangle$. First of all note that for $p \to \infty$

\[ \frac{1}{d} \Big( I + \sum_{s,t} v_{st}\, \Gamma_s \otimes \Gamma_t \Big) \tag{7} \]

with d = 2™*^^!"^!'!^! and $\Gamma_s$, $\Gamma_t$ anti-commuting observables as defined in section 2 is a valid state
for any $|v_{st}| \leq 1$. Hence, we can immediately see that

Corollary 5.2. In any ∞-theory, there exists a strategy for Alice and Bob to win a unique XOR
game with certainty.

Proof. Consider the state given in Eq. (7) with $v_{st} = \pm 1$ such that $p(c|s,t) = 1$ whenever $V(c|s,t) = 1$.
Let Alice and Bob's measurements be given by $\Gamma_s$ and $\Gamma_t$ for questions $s$ and $t$ respectively,
which are valid measurements for all p-theories with $\Gamma_s, \Gamma_t$ constructed as in section 2. □

We leave it as an open question to examine the case of $p < \infty$ for XOR games, since our aim was
merely to show that superstrong correlations can exist, if we allow for relaxed uncertainty relations.
We can see that letting $v_{st} = \pm 1/(\max |r|)^{1/p}$ makes Eq. (7) a valid state for any choice of
$p$, but this may not generally be the optimal choice. The case of GNST is similar, and it has
been shown that any non-local correlations can (approximately) be simulated by such boxes [23].
Optimal bounds for p-GNST with $p < \infty$ can be obtained using techniques analogous to [22].

6 Superstrong random access encodings 

The existence of superstrong non-local correlations is by no means the only difference we can 
observe when moving from quantum theory to p-GNST or p-nonlocal theories. In particular, 
we now show that we can obtain so-called random access encodings which, depending on the 
theory, can be exponentially better than those realized by quantum mechanics. We then investigate 
how uncertainty relations and the restrictions imposed by simultaneous measurements affect this 
encoding. The existence of such random access encodings will play a crucial role when considering
the power of p-GNST theories for communication complexity in section 7.1. In section 7.2 we also
use this random access code to prove a lower bound on the sample complexity of learning states in
GNST.

6.1 In p-GNST 

Intuitively, a random access code [2, 3] allows us to encode $N$ bits into a physical system of size
$n$ such that we can decode any one bit of the original string with probability at least $q$. More
formally,

Definition 6.1. An $[N, n, q]$-random access code (RAC) is an encoding of a string $x \in \{0,1\}^N$ into
an n-gbit state $p_x$, such that there exist measurements $C \in \mathcal{E}_g$ with outcomes $A \in \mathcal{A}^{n}$, and a
decoding algorithm $D : \mathcal{A}^{n} \to \{0, 1\}$ satisfying

\[ \Pr(D(A) = x_k) = \sum_{A} \delta_{D(A), x_k}\, p_x(A|C) \geq q, \]

where $p_x(A|C)$ is the probability of obtaining outcome $A$ when performing the measurement $C$.

It has been shown [2, 3] that in the quantum case, we must have $n \geq (1 - h(q))N$, where
$h$ denotes the binary entropy function. There also exist classical encodings for which $n = (1 - h(q))N + O(\log N)$ [2]. Hence, quantum states offer at most a modest advantage over classical
mechanics and, for $q = 1$, no advantage at all. We now proceed to the surprising result that general
no-signaling states lead to extremely powerful random access codes.
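To get a feeling for the quantum bound $n \geq (1 - h(q))N$, the following short sketch evaluates it for a few recovery probabilities. The function names are hypothetical and the snippet only restates the formula quoted above.

```python
import math

def binary_entropy(q):
    """h(q) = -q log2 q - (1 - q) log2 (1 - q), with h(0) = h(1) = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def quantum_lower_bound(N, q):
    """Minimum number of qubits allowed by n >= (1 - h(q)) N."""
    return (1 - binary_entropy(q)) * N
```

For $q = 1$ the bound equals $N$, so perfect recovery leaves no room for compression, while for $q$ close to $1/2$ the bound becomes vacuous.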

Claim 6.2. In GNST, there exists a $[3^n, n, 1]$-random access code.

Proof. An n-gbit state in GNST is completely characterized by the probabilities of outcomes for
a fixed set of measurements. Recall that a single gbit is a two-level system on which we allow
three possible measurements with two possible outcomes each. Also recall that each $C \in \mathcal{E}_g$ can be
represented as $\mathcal{E}_g = \{\forall k \in \{1,2,3\}^n : \{W_{1,k_1}, \ldots, W_{n,k_n}\}\}$, with $W_{i,1} = X_i$, $W_{i,2} = Z_i$, $W_{i,3} = X_i Z_i$.
Note that each measurement $C$ is associated with one of $N = 3^n$ vectors $k = (k_1, \ldots, k_n)$. Let
$f : \mathcal{E}_g \to \{1, \ldots, N\}$ be a one-to-one function. For each of the $N = 3^n$ bits we wish to encode, we
must specify one measurement $C$ that we can use to extract the $j$-th bit. Let that measurement be
denoted by $f^{-1}(j)$.

We are now ready to define our encoding of the string $x \in \{0,1\}^N$ into an n-gbit GNST state
$p_x$ via the probabilities

\[ p_x(A|C) := \frac{1}{2^n} \big( 1 + A^* (-1)^{x_{f(C)}} \big), \]

where we use the previously defined notation $A^* = \prod_{i=1}^{n} A_i$. It is straightforward to verify that the
state is normalized, positive, and satisfies the no-signaling condition.

We now show that any bit of the original string can be decoded perfectly. If we choose to retrieve
bit $j$, we measure $C = f^{-1}(j)$. That means that we get result $A$ with probability $\frac{1}{2^n}(1 + A^* (-1)^{x_j}) =
\frac{1}{2^{n-1}} \delta_{A^*, (-1)^{x_j}}$. And we get the result $A^* = (-1)^{x_j}$ with probability

\[ \sum_{A : A^* = (-1)^{x_j}} p_x(A|C) = \sum_{A : A^* = (-1)^{x_j}} \frac{1}{2^{n-1}} = 1, \]

where the last equality follows from the fact that we sum over exactly half the $2^n$ possible outcomes
$A_1, \ldots, A_n$. Hence the decoder $D(A) = \frac{1}{2}(1 - A^*)$ will return $x_j$ with certainty. □
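The encoding in this proof is simple enough to simulate for small $n$. The following sketch assumes a particular (hypothetical) choice of the one-to-one map $f$, namely the lexicographic position of the measurement vector $k$, and uses 0-based bit indices; none of the function names come from the original text.

```python
import itertools

def all_measurements(n):
    """All 3^n measurement vectors k in {1,2,3}^n, in lexicographic order."""
    return list(itertools.product((1, 2, 3), repeat=n))

def parity(A):
    """A* = product of the +1/-1 outcomes."""
    result = 1
    for outcome in A:
        result *= outcome
    return result

def encode(x, n):
    """Return the function p_x(A | C) for a bit string x of length 3^n."""
    index = {C: j for j, C in enumerate(all_measurements(n))}  # assumed choice of f
    def p_x(A, C):
        return (1 + parity(A) * (-1) ** x[index[C]]) / 2 ** n
    return p_x

def decode(A):
    """Decoder D(A) = (1 - A*) / 2."""
    return (1 - parity(A)) // 2

def success_probability(x, n, j):
    """Probability that measuring f^{-1}(j) and decoding returns x[j]."""
    p_x = encode(x, n)
    C = all_measurements(n)[j]
    outcomes = itertools.product((1, -1), repeat=n)
    return sum(p_x(A, C) for A in outcomes if decode(A) == x[j])
```

Because the bias only enters through the overall parity $A^*$, summing over the outcome of any single system leaves a uniform distribution on the rest, which is why the state is non-signaling.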

What happens if we impose the uncertainty relation in p-GNST? For convenience's sake, note
that we could rewrite the encoding above in terms of moments, where we let an encoding of a string
$x$ be determined by the moment representation of $p_x$ as

\[ m_x(C = f^{-1}(k)) := (-1)^{x_k} \]

with all other moments set to 0.

To construct an encoding for p-GNST, we consider

\[ m_x(C = f^{-1}(k)) := (-1)^{x_k} \lambda. \]

What is the largest $\lambda$ that satisfies the uncertainty relation? As we noted earlier the maximum
number of anti-commuting Pauli operators is $2n + 1$, so the most restrictive condition we could get
from the uncertainty relation is $(2n + 1)|\lambda|^p \leq 1$. We thus obtain

Claim 6.3. In p-GNST, there exists a $[3^n, n, \frac{1}{2} + \frac{1}{2} (\frac{1}{2n+1})^{1/p}]$-random access code.

Proof. Let $\lambda = (2n + 1)^{-1/p}$, and note that this satisfies the uncertainty relation. Our encoding is
now

\[ p_x(A|C) = \frac{1}{2^n} \big( 1 + (-1)^{x_{f(C)}} \lambda A^* \big). \]

And our probability of getting the correct sign from our measurement goes down to

\[ \frac{1}{2} + \frac{\lambda}{2} = \frac{1}{2} + \frac{1}{2} \Big( \frac{1}{2n+1} \Big)^{1/p}. \]

□

If $p < \infty$ we get an encoding that gets asymptotically worse for large $n$. This should be
compared to the bound on the number of qubits for a quantum random access encoding of $N = 3^n$
bits into $k$ qubits with recovery probability $q = 1/2 + 1/2\,(1/(2n+1))^{1/p}$. From the bound of [2, 3],
we have that the encoding uses exponentially fewer physical bits than what can be obtained in the
quantum setting and hence even p-GNST has a powerful coding advantage over quantum mechanics.
Note that we are always free to split the $N$ bits into smaller pieces first, and encode each piece
independently to keep the recovery probability $q$ constant. This is analogous to the quantum setting
where we can encode each 3 bits into one qubit to obtain a random access code with $n = N/3$.
Alternatively, we can form a simple repetition code, where we have $k$ copies of the random access
codes constructed above. We then have

Claim 6.4. In p-GNST, there exists a $[3^n, (2n+1)^{3/p} n, 1 - \varepsilon]$-random access code with
$\varepsilon = 2\exp(-(2n+1)^{1/p}/2)$.

Proof. We take $k$ copies of the RAC defined in Claim 6.3 and decode by taking the majority of
the individual decodings. Let $Y_j = 1$ if the decoding was successful for the $j$-th copy, and $Y_j = 0$
otherwise. From the Hoeffding inequality we immediately obtain that for $Y = \sum_{j=1}^{k} Y_j$ and $q$ as
defined above

\[ \Pr[|Y - qk| \geq tk] \leq 2 e^{-2 t^2 k}. \]

If we set $t = q - 1/2 = \frac{1}{2}(1/(2n+1))^{1/p}$, that gives us $\Pr[Y \leq k/2] \leq 2 e^{-k (2n+1)^{-2/p}/2}$. Now if we
set $k = (2n+1)^{3/p}$, we have used a total of $(2n+1)^{3/p} n$ gbits and will succeed with probability
$1 - 2 e^{-(2n+1)^{1/p}/2}$ as promised. □
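The trade-off in this repetition code can be tabulated numerically. The sketch below is an illustration only: it rounds the number of copies $k = (2n+1)^{3/p}$ up to an integer (an assumption made here so that a majority vote is well defined) and compares the exact failure probability of the majority vote with the Hoeffding bound and with the claimed $\varepsilon$. Function names are hypothetical.

```python
import math

def single_copy_success(n, p):
    """q = 1/2 + 1/2 * (1/(2n+1))^(1/p) from Claim 6.3."""
    return 0.5 + 0.5 * (2 * n + 1) ** (-1.0 / p)

def copies_needed(n, p):
    """k = (2n+1)^(3/p), rounded up to an integer (assumption of this sketch)."""
    return math.ceil((2 * n + 1) ** (3.0 / p))

def hoeffding_failure_bound(n, p, k):
    """2 * exp(-2 t^2 k) with t = q - 1/2."""
    t = single_copy_success(n, p) - 0.5
    return 2.0 * math.exp(-2.0 * t * t * k)

def exact_majority_failure(n, p, k):
    """Pr[Y <= k/2] for Y binomial(k, q); ties are counted as failures."""
    q = single_copy_success(n, p)
    return sum(math.comb(k, y) * q ** y * (1 - q) ** (k - y)
               for y in range(k + 1) if 2 * y <= k)

def claimed_epsilon(n, p):
    """epsilon = 2 * exp(-(2n+1)^(1/p) / 2) from Claim 6.4."""
    return 2.0 * math.exp(-((2 * n + 1) ** (1.0 / p)) / 2.0)
```

For small $n$ the bound $\varepsilon$ is far from tight, which is consistent with the text's remark that near-perfect recovery is only reached for large $n$.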

Whereas $(2n+1)^{3/p} n$ is still quite large, note that it is nevertheless only polynomial in $n$. The
length of the RAC is hence still poly-logarithmic in our original input size, where we achieve (near)
perfect recovery for large $n$. Finally, we will need to use one more related result.

Claim 6.5. In p-GNST, for $\gamma \in (0, 1/2)$ and $h \geq 2^{2/p} \ln(4/(1/2 - \gamma)^2)$, there exists a
$[3^{n(h,p,\gamma)}, h, \frac{1}{2} + \gamma]$-random access code with

\[ n(h,p,\gamma) = \Big\lfloor \Big( \frac{h}{2^{2/p} \ln(4/(1/2 - \gamma)^2)} \Big)^{p/(p+2)} \Big\rfloor. \]

Proof. Again we take $k$ copies of the RAC defined in Claim 6.3 and decode by taking the majority
of the individual decodings. The probability to decode correctly in that case was at least $1 - 2 e^{-k (2n+1)^{-2/p}/2}$.
Now we want to adjust $k$ and $n$ to get a code with a fixed success rate and that uses no more than
$h$ gbits. We need that (i) $kn \leq h$, that is, our encoding uses at most $h$ physical bits and (ii)
$1 - 2 e^{-k (2n+1)^{-2/p}/2} \geq 1/2 + \gamma$, which forces our probability of success to be at least $1/2 + \gamma$. We
can satisfy (ii) if we set $k = \ln(4/(1/2 - \gamma)^2)(2n+1)^{2/p}$; then (i) tells us that $kn = \ln(4/(1/2 - \gamma)^2)(2n+1)^{2/p} n$,
from which we have $\ln(4/(1/2 - \gamma)^2)\, 2^{2/p} n^{2/p + 1} \leq kn \leq h$ and thus

\[ n \leq \Big( \frac{h}{2^{2/p} \ln(4/(1/2 - \gamma)^2)} \Big)^{p/(p+2)}. \]

Since the smallest system we can encode into is $n = 1$, this tells us that $h$ must be at least
$2^{2/p} \ln(4/(1/2 - \gamma)^2)$. □

Note that although this may not be the best encoding, it suffices to give us the asymptotic
behavior for $h$.

6.2 In p-nonlocal theories 

It is instructive to consider such superstrong encodings in the language of p-nonlocal theories to see
what such superstrong encodings would look like in terms of Pauli matrices. This will also allow us
to compare the consequences of restrictions due to the consistencies of moments from section 2.3
to random access encodings. For the least restrictive p-theory, the p-bin theory, we can construct
the following very simple encoding.

Claim 6.6. In p-bin theories, there exists a $[2^{2n} - 1, n, \frac{1}{2} + \frac{1}{2} (\frac{1}{2n+1})^{1/p}]$-random access code.

Proof. Consider the encoding of a string $x \in \{0,1\}^N$ with $N = 2^{2n} - 1$ into an n p-bit state given
by

\[ \rho_x = \frac{1}{d} \Big( I + \frac{1}{(2n+1)^{1/p}} \sum_{k} (-1)^{x_k} S_k \Big), \]

where $S_k = S_{ab}$ is a string of Pauli matrices, where we simply relabeled the indices $ab$. To decode
the $k$-th bit, we measure $S_k$. A straightforward calculation shows that the probability to obtain
outcome $x_k$ is given by

\[ \Pr[x_k] = \frac{1}{2} \mathrm{Tr}\big[ (I + (-1)^{x_k} S_k)\, \rho_x \big] = \frac{1}{2} + \frac{1}{2 (2n+1)^{1/p}}, \]

as promised. Clearly, the uncertainty relation is satisfied. □
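For a single system ($n = 1$, $d = 2$, $N = 3$) this encoding can be checked by hand or with a few lines of code. The sketch below assumes the form of $\rho_x$ given above with $S_1, S_2, S_3 = X, Y, Z$ and uses 0-based bit indices; the function names are hypothetical. For $p = 2$ it reproduces the familiar success probability $1/2 + 1/(2\sqrt{3})$ of encoding three bits into one qubit.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = (X, Y, Z)

def encode(x, p):
    """rho_x = (1/2) (I + 3^(-1/p) * sum_k (-1)^x_k S_k) for a 3-bit string x."""
    lam = 3 ** (-1.0 / p)
    return [[(I2[i][j] + lam * sum((-1) ** xk * S[i][j] for xk, S in zip(x, PAULIS))) / 2
             for j in range(2)] for i in range(2)]

def expectation(S, rho):
    """Tr(S rho) for 2x2 nested-list matrices."""
    return sum(S[i][j] * rho[j][i] for i in range(2) for j in range(2)).real

def success_probability(x, k, p):
    """Pr[x_k] = (1/2) Tr[(I + (-1)^x_k S_k) rho_x]."""
    rho = encode(x, p)
    return 0.5 * (1 + (-1) ** x[k] * expectation(PAULIS[k], rho))

def uncertainty_sum(x, p):
    """sum_k |Tr(S_k rho_x)|^p over the three anti-commuting Paulis."""
    rho = encode(x, p)
    return sum(abs(expectation(S, rho)) ** p for S in PAULIS)
```

The three Paulis mutually anti-commute, so `uncertainty_sum` is exactly the left-hand side of the relaxed uncertainty relation and evaluates to 1 for this encoding.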

Similarly, we obtain the following encoding for p-box theories, which is in one-to-one correspondence
with the encodings in p-GNST above.

Claim 6.7. In p-box theories, there exists a $[3^n, n, \frac{1}{2} + \frac{1}{2}\left(\frac{1}{2n+1}\right)^{1/p}]$-random access code.

Proof. Our encoding is analogous to the one above, but we restrict ourselves to including only such
strings of Pauli matrices formed by taking tensor products of $\{X, Y, Z\}$, excluding the identity. □

Clearly, we can again obtain an encoding that is poly-logarithmic in the length of the original
input, analogous to Claim 6.4, that has perfect recovery for large n.



6.3 The effect of consistency 

When viewing such encodings in terms of density matrices, it becomes clear why such encodings
do not exist in a quantum setting: all such encodings are in gross violation of the consistency
conditions of section 2.3. Even when we restrict ourselves to p = 2, we can obtain such encodings
whereas in the quantum case we cannot. It is interesting to note that for p = 2, the violation
we can obtain for e.g. the CHSH game is exactly the same as in the quantum setting. Thus it
is perfectly possible to have such superstrong encodings, while simultaneously being restricted to
Tsirelson's bound in the CHSH game for a 2 qubit state. This clearly shows how limited our p-bin,
p-nonlocal, but also p-GNST theories really are. Since GNST is equivalent to a theory based on
non-local boxes, this also shows that considering such boxes is somewhat limiting, and possibly
ignores some aspects present in quantum theory that are of importance for information processing.



7 Implications for information processing 

We now turn to a number of interesting implications of p-GNST and p-theories for information
processing. In particular, we will see that both allow us to save significantly on the amount of data
we need to transmit to solve certain communication problems. In fact, we will see that there exists a
task for which the amount of communication required is exponentially smaller than in quantum
theory. Other information tasks, on the other hand, become more difficult:
we will see that when trying to learn states approximately we need to perform exponentially more
measurements in the case of GNST.

7.1 Communication complexity 

Imagine two (or more) parties, Alice and Bob, who each have an input $x \in \{0,1\}^n$ and $y \in \{0,1\}^n$
respectively, unknown to the other party. Their goal is to compute a fixed function $f : \{0,1\}^{2n} \to
\{0,1\}^m$ by communicating over a channel. The central question of communication complexity is
how many bits they need to transmit in order to compute /. Typically, we thereby only require 
one party (Bob) to learn the result f{x,y). To help them reduce the amount of communication, 
Alice and Bob may possess additional resources such as shared randomness, entanglement, non- 
local boxes or communicate over a quantum channel, and may impose different measures of success. 
For example, they could be interested in computing / only with a certain probability instead of 
computing it exactly. It is well known that if Alice and Bob share non-local boxes, they can
compute any Boolean function $f : \{0,1\}^{2n} \to \{0,1\}$ perfectly by communicating only a single
bit [39], which is true even when the non-local boxes have slight imperfections. Here, we
consider the case where Alice and Bob have no a priori resources but are able to
exchange p-GNST or p-nonlocal states over a suitable channel.
7.1.1 One-way communication



We first make a very modest statement and show that in any one-way communication protocol,
where Alice sends a single message to Bob, we are able to save a constant factor in the number of bits when
computing a Boolean function $f$. These savings are an immediate consequence of the existence of
superstrong random access codes that we discussed in section 6. To communicate with Bob, Alice
constructs the string

\[ m = f(x,0), \ldots, f(x,2^n - 1) \]

and encodes $m \in \{0,1\}^{2^n}$ into a random access code $\rho_m$. To retrieve the correct answer, Bob
simply retrieves bit $m_y = f(x,y)$ from $\rho_m$. Evidently, this type of saving is particularly interesting
in the case where Alice and Bob would need to communicate $n$ bits to compute $f$, which is the case
classically and quantumly if $f = \mathrm{IP}$ is the inner product [20]. By Claims 6.2, 6.3, 6.6 and 6.7 we
immediately obtain that

Claim 7.1. Let $p \to \infty$. Then, to compute the inner product, Alice needs to transmit at most $k$
bits to Bob, where

\[ k = \begin{cases} \frac{1}{\log 3}\, n & \text{for p-GNST and p-nonlocal theories,} \\ n/2 & \text{for p-bin theory.} \end{cases} \]
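
The bookkeeping behind this protocol can be sketched in a few lines of Python. The sketch only models the classical side: it builds Alice's string $m$ for the inner product and counts how many systems a code with capacity $3^k$ (as in Claim 6.7) or $4^k - 1$ (as in Claim 6.6) needs in order to hold $2^n$ bits. It does not model the probabilistic decoding of the random access code itself, and all function names and the `"p-box"` / `"p-bin"` labels are hypothetical.

```python
def inner_product(x: int, y: int) -> int:
    """IP(x, y) = sum_i x_i y_i mod 2, with bit strings stored as integers."""
    return bin(x & y).count("1") % 2


def alice_message(x: int, n: int) -> list:
    """The string m = f(x, 0), ..., f(x, 2^n - 1) for f = IP."""
    return [inner_product(x, y) for y in range(2 ** n)]


def systems_needed(n: int, theory: str) -> int:
    """Smallest k whose assumed code capacity is at least 2^n bits."""
    capacity = {
        "p-box": lambda k: 3 ** k,      # tensor products of {X, Y, Z}
        "p-bin": lambda k: 4 ** k - 1,  # all non-identity Pauli strings
    }[theory]
    k = 1
    while capacity(k) < 2 ** n:
        k += 1
    return k


if __name__ == "__main__":
    n, x, y = 10, 0b1011001110, 0b0110101011
    m = alice_message(x, n)
    # Bob's target bit is the y-th entry of m.
    print(m[y] == inner_product(x, y))
    print(systems_needed(n, "p-box"), systems_needed(n, "p-bin"))
```

For $n = 10$ the count is 7 systems in the first case and 6 in the second, close to $n/\log 3 \approx 6.3$ and $n/2 = 5$; the small excess comes from rounding and from excluding the identity string.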



7.1.2 Private information retrieval 

More striking, though, are the possibilities of p-GNST or p-theories for the task of private information
retrieval. Here, one (or more) database servers each hold a copy of the database string $x \in \{0,1\}^n$.
A database user should be able to retrieve any bit $x_i$ of his choosing, while the servers should not
learn the desired index $i$. A protocol that satisfies these requirements is the trivial one, where the
server simply sends the entire string $x$ to the user. The question is thus whether it is possible to
perform this task by communicating fewer than $n$ bits. If only a single server is used, it is known
that the trivial protocol is optimal and we need to communicate $\Theta(n)$ bits, even if we are allowed
quantum communication [25]. It is clear that the superstrong encodings from above allow us to
beat this bound trivially, by asking the server to encode $x$ into a superstrong random access code.



Hence, as an immediate consequence of Claims 6.2, 6.4, 6.6, and 6.7, we have

Claim 7.2. In any p-GNST, p-bin, and p-box theory, there exists a single server private information
retrieval scheme requiring $O(\mathrm{polylog}(n))$ bits of communication for large n.



7.2 Learnability 

We consider a scenario in which there is an unknown state for which we are trying to learn an 
approximate description. In particular, imagine some arbitrary probability distribution over pos- 
sible two-outcome measurements. We are given the expectation value for each measurement in a 
finite set picked according to this distribution. We then construct an approximate description of 
the state which agrees with all the expectation values we have observed so far. This description is 
considered to be good if it predicts the correct results for most future measurements drawn from 
the same distribution. The central question is how many measurement results we need to be able
to construct a good description.

The existence of strong random access codes has implications for state learning. Aaronson [1]
used an upper bound on the number of bits that can be encoded into an n qubit RAC to upper
bound the number of measurements needed to learn an approximate description of an n qubit state.
He took solace in the fact that, despite the exponential number of parameters describing a quantum 
state, a linear (in the number of qubits) number of measurements suffice to learn an approximate 
description of the state. If an exponential number of measurements were really required, we could 
never hope to do enough measurements to verify the identity of quantum states of a few hundred 
particles. 

We show the converse for states in p-GNST. We use our constructions of random access codes to 
lower bound the number of measurements needed to learn an approximate description of the state. 
We find that an exponential number of measurements is required to find such a description and 
therefore one could never hope to do enough measurements to learn a description of a state with 
a modest number of particles, even approximately. This holds even for theories where p = 2 and 
the violation of the CHSH inequality is the same as for quantum mechanics. This demonstrates 
an unusually powerful theory which starkly contrasts with quantum mechanics and the p-nonlocal 
theory. 

We begin with a section defining the relevant tools: a definition of the learning scenario, and 
a measure of state complexity known as the "fat shattering dimension." We then restate a known 
lower bound on the number of samples needed for learning in terms of the fat shattering dimension. 
In the next section, we derive lower bounds on learnability for p-GNST theories. First, we use our 
random access codes to lower bound the fat shattering dimension for p-GNST states. Then we can 
use this result to lower bound the number of samples needed to learn p-GNST states. 



7.3 Tools 

We begin by introducing some terminology from statistical learning theory. Let the set $S$ denote
the sample space, which will correspond to the space of possible measurements in our case. A
probabilistic concept over $S$ is just a function $F : S \to [0,1]$, and is equivalent to a state which maps
measurement choices to expectation values. A set of such concepts is referred to as the concept class
$C$ over $S$ and corresponds to the set of all states. We consider the learning situation in which you
are given the value of the target concept (state) over some samples drawn independently according
to an arbitrary distribution. The goal is to output a hypothesis concept that will give values close to
the target concept for most samples drawn from the same distribution. A sample size that is large
enough to allow this to be accomplished with high probability is said to be sufficient. To restate
the connection, in GNSTs we will say that a state corresponds to a concept, and a measurement
on the state to a sample. We will make these notions precise before we demonstrate the connection
between RACs and fat-shattering dimension in 7.5.

We adopt our definition of probabilistic concept learning from Anthony and Bartlett [1].

Definition 7.3 (Anthony and Bartlett [1]). Let $S$ be a sample space, let $C$ be a probabilistic concept
class over $S$, and let $D$ be a probability measure over $S$. Fix an element $\rho \in C$, as well as error
parameters $\varepsilon, \eta, \gamma, \delta > 0$ with $\gamma > \eta$. Let $k_0(\eta, \gamma, \varepsilon, \delta)$ be some function of the error parameters.
Suppose we draw a training set of $k$ samples $T = (s_1, \ldots, s_k)$ independently according to $D$, and
then choose any hypothesis $\sigma_T \in C$ such that $|\sigma_T(s_i) - \rho(s_i)| \leq \eta$ for all $s_i \in T$. Then if for $k \geq k_0$

\[ \Pr\left[|\sigma_T(s) - \rho(s)| > \gamma\right] \leq \varepsilon \]

with probability at least $1 - \delta$ over $T$, we say that $k_0$ is a sufficient sample size to learn $C$.

This says that if the size of the training set, $k$, is bigger than $k_0$, then with probability $1 - \delta$,
the training set $T$ that we pick according to $D$ will be a good training set. That is, a hypothesis
concept $\sigma$ which matches the target state on the training set will only be different from the target
state on some other sample with small probability, $\varepsilon$.

To define a lower bound on $k_0$, we will need a measure of complexity called the fat-shattering
dimension.

Definition 7.4 (Aaronson [1]). Let $S$ be a sample space, let $C$ be a probabilistic concept class over
$S$, and let $\gamma > 0$ be a real number. We say a set $\{s_1, \ldots, s_k\} \subseteq S$ is $\gamma$-fat-shattered by $C$ if there
exist real numbers $\alpha_1, \ldots, \alpha_k$ such that for all $B \subseteq \{1, \ldots, k\}$, there exists a probabilistic concept
$\rho \in C$ such that for all $i \in \{1, \ldots, k\}$,

(i) if $i \notin B$ then $\rho(s_i) \leq \alpha_i - \gamma$, and
(ii) if $i \in B$ then $\rho(s_i) \geq \alpha_i + \gamma$.

Then the $\gamma$-fat-shattering dimension of $C$, or $\mathrm{fat}_C(\gamma)$, is the maximum $k$ such that some
$\{s_1, \ldots, s_k\} \subseteq S$ is $\gamma$-fat-shattered by $C$. (If there is no finite such maximum, then $\mathrm{fat}_C(\gamma) = \infty$.)
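
For a finite concept class, the condition in this definition can be checked by brute force over all subsets $B$. The following Python sketch is a toy illustration only: the concept class is given as a table of values on a fixed list of samples, the witnesses $\alpha_i$ are supplied by the caller, and the name `is_fat_shattered` is hypothetical.

```python
import itertools


def is_fat_shattered(concepts, alphas, gamma):
    """Check gamma-fat-shattering of samples s_1..s_k by a finite concept class.

    concepts[c][i] is the value of concept c on sample s_i, and alphas[i] is
    the witness alpha_i. Every subset B of the samples must be realised by a
    concept that is >= alpha_i + gamma on B and <= alpha_i - gamma off B.
    """
    k = len(alphas)
    for in_b in itertools.product([False, True], repeat=k):
        realised = any(
            all(
                (c[i] >= alphas[i] + gamma) if in_b[i] else (c[i] <= alphas[i] - gamma)
                for i in range(k)
            )
            for c in concepts
        )
        if not realised:
            return False
    return True


if __name__ == "__main__":
    # Toy class modelled on a random access code with bias 0.1: the concept
    # for string x takes value 1/2 + 0.1 on sample i if x_i = 1, else 1/2 - 0.1.
    bias, k = 0.1, 3
    concepts = [
        [0.5 + bias * (2 * bit - 1) for bit in x]
        for x in itertools.product([0, 1], repeat=k)
    ]
    print(is_fat_shattered(concepts, [0.5] * k, 0.05))
    print(is_fat_shattered(concepts, [0.5] * k, 0.2))
```

The toy class mirrors the way a random access code with bias 0.1 would shatter its decoding measurements: the check succeeds for a margin below the bias and fails for a margin above it.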

The fat-shattering dimension lower bounds the number of samples needed to learn a probabilistic 
concept. 

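Lemma 7.5 (Anthony and Bartlett [1]). Suppose $C$ is a probabilistic concept class over $S$ and set
$0 < \eta < \gamma < 1$, $\varepsilon, \delta \in (0,1)$. Then if $\mathrm{fat}_C(\gamma) \geq d \geq 1$ and $\gamma^2 > 4d\,2^{-\sqrt{d}}$, any sample size $m_0$
sufficient to learn $C$ satisfies

\[ m_0(\eta, \gamma, \varepsilon, \delta) \geq \max\left(\frac{1}{32\varepsilon}\left(\frac{d}{\ln^2(4d/\gamma^2)} - 1\right), \frac{1}{\varepsilon}\ln\frac{1}{\delta}\right). \]

This concludes the results we will need from statistical learning theory.

7.4 Lower bounds on sample complexity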

Our next step is to show that the existence of random access codes lower bounds the fat-shattering
dimension. First we have to carefully define what "concept" we will be learning and what constitutes
our sample space. For the purposes of learning in GNSTs, the sample space is just the set of possible
measurements, where we allow general measurements by first making some fiducial measurement
on the state, and then post-processing the result using some decoding function. So we can define
$S_{GNST} := \{(C, D) \mid C \in \mathcal{E}_g, D : \mathcal{A}^n \to \{0,1\}\}$. For some sample $(C, D) \in S_{GNST}$, a concept is
specified by the state $\rho_x$ in a GNST via the probability $\rho_x(C, D) := \sum_{A \in \mathcal{A}^n} D(A)\rho_x(A|C)$,
where $\rho_x$ is an $n$-partite state in some GNST. Then the concept class $C_{GNST}$ is the set of concepts
specified by all the states in GNST.

Note that a "sample" is stronger than a typical notion of measurement. Usually we say that the 
measurement gives a result with some probability, but given some sample, the concept p actually 
returns the probability of that outcome occurring. This stronger notion of sampling is all we 
consider here since we are only lower bounding the number of samples needed. 

Claim 7.6. Let the concept class $C_{GNST}$ over $S_{GNST}$ consist of all $\rho_x(C, D) = \sum_{A \in \mathcal{A}^n} D(A)\rho_x(A|C)$,
where $\rho_x$ describes any $n$-partite state in a GNST, over the sample space $\{(C, D) \mid C \in \mathcal{E}_g, D : \mathcal{A}^n \to \{0,1\}\}$.
For integers $n$, $N(p,n)$ and $\gamma \in (0,1)$, if there exists an $[N(p,n), n, \frac{1}{2} + \gamma]$-RAC then
$\mathrm{fat}_{C_{GNST}}(\gamma) \geq N(p,n)$.

Proof. By the RAC definition, there exist a set of measurements $\{(C^{(1)}, D^{(1)}), \ldots, (C^{(N)}, D^{(N)})\}$ and
states specified by (the concepts) $\rho_x$ for $x \in \{0,1\}^N$ so that

(i) if $x_i = 0$ then $\rho_x(C^{(i)}, D^{(i)}) \leq \frac{1}{2} - \gamma$

(ii) if $x_i = 1$ then $\rho_x(C^{(i)}, D^{(i)}) \geq \frac{1}{2} + \gamma$

Therefore, this set of samples is $\gamma$-fat-shattered by $C_{GNST}$. Since $\mathrm{fat}_{C_{GNST}}(\gamma)$ is the size of the largest
sample set shattered, $\mathrm{fat}_{C_{GNST}}(\gamma) \geq N(p,n)$. □



Combining Claim 6.5 with Claim 7.6 and Lemma 7.5 we get the following result.



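Corollary 7.7. For $h$-partite concepts in $C_{p\text{-GNST}}$ and error parameters $\varepsilon, \eta, \gamma, \delta > 0$ with $\gamma > \eta$,
if $h \geq 2^{2/p}\ln(4/(1/2-\gamma)^2)$ and

\[ k < \max\left(\frac{1}{32\varepsilon}\left(\frac{3^{n(h,p,\gamma)}}{\ln^2\left(4 \cdot 3^{n(h,p,\gamma)}/\gamma^2\right)} - 1\right), \frac{1}{\varepsilon}\ln\frac{1}{\delta}\right) \]

for

\[ n(h,p,\gamma) = \left[\frac{h}{2^{2/p}\ln(4/(1/2-\gamma)^2)}\right]^{\frac{1}{2/p+1}}, \]

then $k$ is not a sufficient sample size to learn states in $C_{p\text{-GNST}}$.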



That is, we need $\Omega\left(3^{n^{1/(2/p+1)}} / n^{2/(2/p+1)}\right)$ samples to learn an $n$-partite state in p-GNST to great
accuracy. For $p = 2$ we have an uncertainty relation analogous to quantum mechanics that rules
out super-quantum violations of the CHSH bound. Nevertheless it still takes $\Omega(3^{\sqrt{n}}/n)$ samples to
learn these states, as compared to $O(n)$ in the quantum case.
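
To get a feeling for the size of this bound, the following Python sketch evaluates it numerically. It is a sketch under stated assumptions: it takes the lower bound in the form $\max\left(\frac{1}{32\varepsilon}\left(\frac{d}{\ln^2(4d/\gamma^2)} - 1\right), \frac{1}{\varepsilon}\ln\frac{1}{\delta}\right)$ with $d = 3^{n(h,p,\gamma)}$, uses the natural logarithm throughout, and takes $n(h,p,\gamma)$ with the factor $1/2 - \gamma$ that appears in the proof of the amplified code. The function names are hypothetical.

```python
import math


def n_of_h(h: float, p: float, gamma: float) -> float:
    """Assumed n(h, p, gamma) = [h / (2^(2/p) ln(4/(1/2-gamma)^2))]^(1/(2/p+1))."""
    scale = 2 ** (2.0 / p) * math.log(4.0 / (0.5 - gamma) ** 2)
    return (h / scale) ** (1.0 / (2.0 / p + 1.0))


def sample_lower_bound(d: float, gamma: float, eps: float, delta: float) -> float:
    """Assumed lower bound on a sufficient sample size for fat-shattering dimension >= d."""
    first = (d / math.log(4.0 * d / gamma ** 2) ** 2 - 1.0) / (32.0 * eps)
    second = math.log(1.0 / delta) / eps
    return max(first, second)


if __name__ == "__main__":
    p, gamma, eps, delta = 2.0, 0.25, 0.1, 0.1
    for h in (100, 400, 1600):
        n = n_of_h(h, p, gamma)
        print(h, n, sample_lower_bound(3.0 ** n, gamma, eps, delta))
```

For $p = 2$ the effective $n$ grows like $\sqrt{h}$, so quadrupling $h$ doubles the exponent of 3 in the bound.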





                            p-bin               p-GNST/p-box        p-nonlocal          Quantum             Classical
Non-signaling               yes                 yes                 yes                 yes                 yes
Satisfies p-uncertainty     yes                 yes                 yes                 p = 2               n/a
Simultaneous measurements   no                  local               commuting           commuting           all
CHSH violation              1/2 + 1/2^{1/p+1}   1/2 + 1/2^{1/p+1}   1/2 + 1/2^{1/p+1}   1/2 + 1/2^{1/2+1}   3/4
RAC bits to encode N bits   O(polylog(N))       O(polylog(N))       ?                   Ω(N)                Ω(N)
PIR from N bits             O(polylog(N))       O(polylog(N))       ?                   Ω(N)                Ω(N)
"Learning" states           hard                hard                ?                   easy                easy

Table 1: Summary of properties and results for various theories.

8 Consistency of measurements



9 Conclusion and open questions 

We have shown that relaxing uncertainty relations can lead to superstrong non-local correlations. 
This is quite intuitive when considering Tsirelson's bound as a consequence of such an uncertainty 
relation in the quantum setting. We then constructed a range of theories inspired by such relax- 
ations, and investigated their power with respect to a number of information processing problems. 
In particular, we obtained superstrong random access encodings and savings for communication 
complexity. At the same time, however, it turned out to become harder to learn a state in such a 
theory. We then discussed what makes such superstrong encodings possible in our p-theories, but 
also in GNST. We identified a number of simple constraints that prevent us from constructing a 
similar encoding in the quantum setting. Our work may indicate that using "box-world" to understand
any other problems within quantum information beyond non-local correlations may be
difficult, as "box-world" differs from the quantum setting with respect to such constraints, at least
when drawing a one-to-one analogy from a gbit to a qubit as in GNST [8]. It is important to note
that these constraints did not prevent us from observing superstrong non-local correlations, but
merely forbid our encodings in section 6. If one would like to use "box-world" to understand other
aspects one could either impose such consistency constraints, or look for a different approach to 
defining such theories. GNST was defined by first specifying states and then allowing all operations 
that take valid states to valid states. If one would have specified the theory in terms of allowed 
transformations, instead of states, such encodings could also have been ruled out. For example, in
the quantum setting one can transform operators $X \otimes X$, $Z \otimes Z$ and $XZ \otimes XZ$ into a bipartite
form via a unitary operation. When looking at a density matrix expressed in terms of strings of
Pauli matrices, its coefficients (which directly determine the moments for measurements of strings
of Paulis) must obey similar constraints to the coefficients belonging to bipartite operators of the
form $\mathbb{I} \otimes X$, $X \otimes \mathbb{I}$, $X \otimes X$ for example.

Finally, it is clear that both the uncertainty relation and the consistency constraints are obeyed
in the quantum setting, since we demand that for any $\rho$ we have $\mathrm{Tr}(\rho) = 1$ and $\rho \geq 0$ to be a
valid quantum state. Not surprisingly, both forms of constraints are thus necessary (but in higher
dimensions not always sufficient) conditions for $\rho \geq 0$. Such characterizations are not easy for
$d > 2$ [26, 10], and it remains an interesting open problem to find an intuitive interpretation
for such conditions in higher dimensions, and their consequence for information processing tasks. 